China’s New Tennis Robot Reveals the Next Step for Humanoid Robots

TheAIGRID · 19.03.2026 · 4,987 views · 153 likes


Video description
🌐 Subscribe To My Newsletter - https://aigrid.beehiiv.com/subscribe
Get your Free AGI Preparedness Guide - https://theaigrid.kit.com/agi
🎓 Learn AI In 10 Minutes A Day - https://www.skool.com/theaigridacademy
🐤 Follow Me on Twitter - https://twitter.com/TheAiGrid
Links From Today's Video: https://zzk273.github.io/LATENT/
Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed?
(For Business Enquiries) contact@theaigrid.com
Music Used:
LEMMiNO - Cipher https://www.youtube.com/watch?v=b0q5PR1xpA0 (CC BY-SA 4.0)
LEMMiNO - Encounters https://www.youtube.com/watch?v=xdwWCl_5x2s
#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Table of contents (3 segments)

Segment 1 (00:00 - 05:00)

So apparently robots can now play tennis like humans, and we have to talk about how this even happened. It actually all starts with a group of researchers spread across some of China's top institutions: Tsinghua University, Peking University, a robotics company called Galbot, the Shanghai AI Laboratory, and others. These people have been working on a problem that sounds simple but is actually one of the hardest challenges in all of robotics: getting a humanoid robot, not just an arm, not just a wheeled machine, an actual two-legged robot, to do something really athletic, something in real time, something that requires coordination humans take for granted. The sport they chose for this is tennis. And if you think about it for a second, tennis is the kind of nightmare scenario for a robot. The ball comes at you at 30 m/s. That's over 60 mph. You have to track it in real time, move your entire body into position, plant your feet, swing the racket, and make contact with a ball that only touches your racket strings for a few milliseconds. Humans spend years learning how to do this, and some people spend their entire lives at it and never get particularly good. Now, these researchers decided to teach a robot how to do it. And it wasn't just any robot either. It was the Unitree G1. If you're not familiar with it, which you probably are, this is the small humanoid robot made by the Chinese company Unitree. It's about 127 cm tall. That's about 4'2". So we're talking about a robot the size of a primary school kid. It has 29 degrees of freedom, which basically means 29 different joints it can move independently. And they replaced its right hand with a 3D-printed connector so it could grip a full-sized tennis racket. Now, you've maybe already seen the Unitree G1 playing table tennis, but remember, table tennis is one thing. You're standing in the same spot.
Basically, the ball is slow, the table is small, and tennis is a completely different beast. You need full-court movement. You need to sprint. You need to handle balls coming at three, four, five times the speed of a ping-pong ball. So this is where we get the system called LATENT. The name stands for "learns athletic humanoid tennis skills from imperfect human motion data," and yes, that is an absolutely brutal acronym. But what the system actually does is genuinely incredible. Here's the problem they were facing. If you want to teach a robot to play tennis, the obvious approach is to get data from really good tennis players: record a bunch of professional matches, capture every movement perfectly, feed it all into the system, and let the AI do its thing and learn from the best. But that doesn't work, and the reason is pretty interesting. Professional tennis data from real matches is incredibly hard to capture with the precision you need for robotics. You need full-body motion tracking of elite players during actual competitive play. And even if you could somehow get that data, a human body and a robot body are completely different: different proportions, different joint limits. You can't just copy a human movement onto a robot and expect it to work. So the team tried something different. Instead of perfect data from the pros, they used imperfect data from amateurs. They brought in five amateur tennis players and put them in a small motion capture area. We're talking 3 m x 5 m, about 17 times smaller than an actual tennis court. And then they just had these five people hit forehands and backhands, do some lateral shuffles, practice crossover steps. Basic stuff. Just five hours of data total. And that's it. That's all the human data they used: five hours, five amateurs, and a capture area the size of a large living room.
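As a quick sanity check on that "17 times smaller" figure: a standard doubles tennis court measures 23.77 m x 10.97 m (ITF dimensions, an outside fact not stated in the video), which lines up with the 3 m x 5 m capture area:

```python
# Sanity-check the transcript's "about 17 times smaller" claim.
# Capture area: 3 m x 5 m. Full doubles court: 23.77 m x 10.97 m (ITF).
capture_area = 3 * 5                 # 15 m^2
court_area = 23.77 * 10.97           # ~260.8 m^2
ratio = court_area / capture_area
print(f"court is {ratio:.1f}x the capture area")  # ~17.4x
```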
And you're probably thinking, well, how on earth do you go from that to a robot that's actually playing tennis rallies? Because those motion fragments are incomplete, they're from a tiny space, and those players aren't even playing real tennis; they're just doing isolated movements. This is where the AI comes in, the brain. Here's what the researchers built. It isn't just a robot that copies human movements. It's a system that takes those imperfect fragments and figures out how to stitch them together into something that works on a full tennis court. This is how it works. It's got a three-layer architecture. First, on the left-hand side, you can see the motion tracker. This takes those raw human movements and translates them into something the robot can physically do. Because remember, the robot's body is different from a human body: its legs are shorter, its arms move differently. So the AI has to watch a human swing a forehand and figure out what the robot equivalent of that swing looks like. Not a perfect copy, a functional translation. Then on top of that sits what they call a latent action space, and this is the clever bit. Instead of the robot learning raw joint movements, it learns a compressed representation of movement. Think of it like this: instead of memorizing every single muscle twitch in a tennis swing, the robot learns the essence of what a forehand feels like, the general shape of it, and then fills in the details itself. This means it can adapt and generate movements it's never actually seen in the training data. And then there's the high-level policy. This is the brain. This is the part that sees

Segment 2 (05:00 - 10:00)

the ball coming, predicts where it's going to land, decides whether to hit a forehand or a backhand, and coordinates the robot's entire body to get into position and make that shot. Now, all of this, the motion tracking, the latent space, the high-level policy, the risk correction, all of this gets trained in simulation first. The robot learns to play tennis entirely in a virtual environment: millions of rallies, millions of shots, all simulated. Then they transfer it to the real robot. And normally, this is where everything falls apart. This is actually one of the biggest unsolved problems in robotics. It's called the sim-to-real gap. Simulation is perfect; reality is messy. The floor might be slightly uneven. The ball bounces differently. The wind exists. The robot's joints have tiny imperfections. There are a thousand tiny variables that the simulation doesn't account for. And historically, policies that work beautifully in simulation completely fall apart the moment you put them on a real robot. So the team added domain randomization, and this is a really elegant trick. They deliberately made the simulation imperfect. They randomized the physics, added noise to the observations, changed the friction, the ball behavior, the robot's own mass distribution. They basically said, "We're going to make training so chaotic that reality feels easy by comparison." So when the robot finally steps onto the real court and everything is slightly wrong compared to the simulation, it doesn't panic, it adapts. It's already seen worse. Way worse. And of course, as you can see, the results are kind of absurd. In the real world, the robot hits forehands with a 91% success rate, backhands at a 78% success rate, and it can sustain multi-shot rallies with actual human players. Not one hit, not two hits: continuous back-and-forth rallies. That's what we're talking about here. This is a robot that's 3½ ft tall.
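The domain-randomization trick described above can be sketched in a few lines. Everything here is illustrative: the parameter names and ranges are assumptions for demonstration, not the values used in the paper.

```python
import random

def randomize_physics(base):
    """Resample simulator parameters at the start of each training episode.
    Ranges are illustrative, not the paper's actual values."""
    return {
        "friction":         base["friction"] * random.uniform(0.7, 1.3),
        "robot_mass":       base["robot_mass"] * random.uniform(0.9, 1.1),
        "ball_restitution": base["ball_restitution"] * random.uniform(0.85, 1.15),
    }

def add_observation_noise(obs, sigma=0.01):
    """Corrupt sensor readings so the policy can't rely on perfect state."""
    return [x + random.gauss(0.0, sigma) for x in obs]

# Hypothetical nominal parameters (robot_mass matches the ~35 kg G1).
base = {"friction": 0.8, "robot_mass": 35.0, "ball_restitution": 0.75}
episode_params = randomize_physics(base)            # new physics every episode
noisy_obs = add_observation_noise([0.1, 2.4, -0.3])  # noisy ball position
```

A policy trained across thousands of such perturbed episodes has, in effect, already practiced in worlds messier than the real court.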
It weighs about 35 kg, and it's sprinting across the court at over 6 m/s. That's faster than the average person jogs. And it's tracking a ball that in some cases is coming at it at 15 to 30 m/s, making contact with the racket during a window that lasts just a few milliseconds. Essentially zero margin for error. And in simulation, the numbers are even more insane: a 97% forehand success rate, 82% on backhands, and every other method they compared it against basically failed. Standard reinforcement learning couldn't do it. Other motion learning approaches couldn't do it. The baselines they tested against either couldn't sustain a rally at all or had success rates so low they weren't even close to competitive. And remember, this is from five hours of amateur data and a capture space the size of a living room. That's what makes this paper hit so different. They didn't need perfect data. They didn't need professional athletes. They didn't need years of recording. They just took scrappy, incomplete, imperfect motion clips and built a system that turns them into something that genuinely looks athletic. I came across a Reddit thread for this paper, and it got over two and a half thousand upvotes, which for a robotics paper is massive. One of the top comments got people thinking about the point where we'll all be able to compete against AI in any sport the same way we can set the difficulty of a video game. Now, you might be thinking, okay, a robot that can play tennis. Is this just another flashy demo? Does this actually matter for the real world? The researchers are actually very clear about this in the paper. Tennis is a proof of concept. The real point is learning from messy data, because data has always been an incredible bottleneck. Think about it: if you can learn from imperfect data, it could basically work for any robot, and it could work for any physical task.
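To make the "latent action space" idea from earlier concrete, here is a minimal NumPy sketch: full-body joint commands are compressed into a small latent vector, and the high-level policy only ever acts in that small space. The dimensions and the linear encoder/decoder are hypothetical stand-ins; a real system would train networks on the motion-capture clips.

```python
import numpy as np

rng = np.random.default_rng(0)

JOINT_DIM, LATENT_DIM = 29, 8  # 29 DoF robot; latent size is a made-up example

# Hypothetical linear encoder/decoder; in practice these would be learned
# from the amateur motion-capture data rather than sampled randomly.
W_enc = rng.normal(0.0, 0.1, (LATENT_DIM, JOINT_DIM))
W_dec = rng.normal(0.0, 0.1, (JOINT_DIM, LATENT_DIM))

def encode(joint_targets):
    """Compress a full-body pose command into the latent action space."""
    return W_enc @ joint_targets

def decode(latent_action):
    """Expand a latent action back into per-joint targets. Nearby latent
    vectors decode to nearby poses, which is what lets the policy generate
    swings it never saw in training."""
    return W_dec @ latent_action

z = encode(rng.normal(size=JOINT_DIM))  # 29-D pose -> 8-D latent action
joints = decode(z)                      # 8-D latent -> 29-D pose
```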
Soccer, parkour, warehouse work, disaster recovery: anything that requires a humanoid robot to move fast, react in real time, and coordinate its entire body. And the data bottleneck has always been the problem. Everyone knew that if you could get enough perfect data, you could train a robot to do amazing things. But perfect data is expensive, slow to collect, and often impossible to get. What this team showed is that you don't need perfect data. You need an architecture smart enough to work with imperfect data. And that changes the equation entirely. Think about what this means for the humanoid robotics companies right now. Companies like Figure and Tesla are all working on getting humanoid robots into factories and warehouses, and one of the biggest questions has been how you actually teach those robots to move in useful ways without having to hand-program every single motion. This is where LATENT suggests the answer: you just capture some humans doing the task, and it doesn't have to be perfect or complete. A few hours in a small room, and then you let the AI figure out how to translate that into robot movement. That's a massive deal, because collecting five hours of amateur data is cheap, fast, and anyone can do it. And if this approach actually generalizes, you could potentially teach a humanoid robot a new physical skill every single week. Now, the researchers also mention a few things they want to fix. One of them: right now, the robot relies on an external motion capture system to track the ball and know where it is on the court. That's fine for a lab demo, but obviously you can't put motion capture cameras on every construction site or warehouse floor. So the next step is getting the robot to use its own eyes: active vision, onboard cameras, making the robot fully self-contained.
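Whatever the sensing setup, the system has to turn tracked ball states into a predicted landing point, as described earlier for the high-level policy. A minimal sketch of that prediction, under the simplifying assumption of drag-free projectile motion (the paper's actual estimator is not described in the video):

```python
G = 9.81  # gravitational acceleration, m/s^2

def predict_landing(x, z, vx, vz):
    """Drag-free ballistic prediction. Given horizontal position x (m),
    height z (m), and velocities vx, vz (m/s), solve
    z + vz*t - 0.5*G*t**2 = 0 for the positive root and return
    (time_to_land, landing_x)."""
    t = (vz + (vz * vz + 2 * G * z) ** 0.5) / G
    return t, x + vx * t

# A ball crossing at 1 m height, 20 m/s horizontal, flat trajectory:
t, landing_x = predict_landing(0.0, 1.0, 20.0, 0.0)  # lands ~9 m away in ~0.45 s
```

At 15 to 30 m/s incoming speeds, that sub-half-second window is the entire budget for positioning the whole body, which is why the prediction has to run continuously rather than once.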

Segment 3 (10:00 - 10:29)

And of course, the next step is to push towards multi-agent scenarios. Right now it's one robot playing against one human, but imagine two robots playing against each other, or a robot doubles partner. That would require a whole new level of coordination and strategy. And then there's the generalization question. Can the same system learn soccer from a few hours of amateur footage? Can it learn to dance? Can it learn martial arts? The architecture doesn't have anything tennis-specific baked into it, so in theory it should be adaptable, but that is for another video.
