Nvidias NEW Robotics Breakthroughs Accelerates physics by 10,000x (Nvidia Hover)
10:06


TheAIGRID · 08.11.2024 · 19,094 views · 438 likes

Video description
Prepare for AGI with me - https://www.skool.com/postagiprepardness 🐤 Follow me on Twitter https://twitter.com/TheAiGrid 🌐 Check out my website - https://theaigrid.com/ Links from today's video: Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed? (For business enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Table of contents (3 segments)

Segment 1 (00:00 - 05:00)

So NVIDIA has once again done something absolutely crazy, showing the world that they are not just a GPU manufacturing company but are actually at the forefront of major AI research. Essentially this refers to their recent research out of GEAR. GEAR is the team building generalist embodied agents in the real world; the name stands for Generalist Embodied Agent Research, and their stated mission is building generally capable agents in many worlds, virtual and real. I'm going to explain what this team is, because I want to show you exactly what their research does.

The goal of the NVIDIA GEAR team, led by Dr. Jim Fan and Professor Yuke Zhu, is to build foundation models for embodied agents in virtual and physical worlds. Their research agenda encompasses multimodal foundation models (LLMs for planning and reasoning, vision-language models, and world models trained on internet-scale data sources), general-purpose robots (robotic models and systems that enable robust locomotion and dexterous manipulation in complex environments), foundation agents in virtual worlds (large action models that autonomously explore and continuously bootstrap their capabilities across different games and simulations), and simulation infrastructure and synthetic data.

And now we get to the recent research they did, which is pretty crazy. They recently published this paper, and don't worry, this isn't going to be a video that just stares at screenshots of a paper; I'm actually going to show you some really cool simulations they did and why this is truly the future of robotics. The work is called HOVER, a versatile neural whole-body controller for humanoid robots, or, in less complicated terms, basically one brain for all robot movement. Let me explain exactly what that means.

Robotics has a major problem. Think about it like this: imagine watching a talented musician switch between playing a piano, a violin, and drums, their body naturally adapting to every instrument. Now imagine a robot trying to do the exact same thing, but instead of doing it fluidly, it has to reboot its brain for every single task. That's kind of how robotics works today: every time we ask a robot to switch from walking to grabbing objects to dancing, it's basically forced to learn an entirely new language from scratch. This is where HOVER comes in, the world's first universal controller for humanoid robots, so you can think of it as giving robots their own version of human intuition.

Until now, robots needed different control systems for every single task: one control system for walking, another for picking up objects, another for maintaining balance. It's basically like having to switch brains for every different movement, which is clunky and not efficient at all. This fragmented approach not only makes robots less efficient but also limits their ability to adapt to new situations, something humans do naturally without thinking.

Traditional robots face three major challenges: they need special programming just to walk without falling (you've seen numerous robots fall over or have to be held up by some kind of rope), they need a separate policy to manipulate objects with their hands (something humans do completely easily), and they need yet another system to coordinate full-body movement. The reason this is completely insane, when you think about it, is that it's like having to drive a car with three different drivers: one for the steering, one for the pedals, and one for changing gears.

HOVER, by contrast, is one unified system that learns pretty much the way a human does. Using advanced AI and motion-capture technology, it watches human movements and learns to replicate them naturally. Instead of switching between different control systems, HOVER acts more like a human brain, coordinating multiple movements simultaneously while maintaining balance and precision, enabling more efficient humanoid robotics. Just as a person can walk while carrying a cup of coffee and having a conversation, HOVER lets robots smoothly combine different actions without missing a beat.

Now, Dr. Jim Fan, a senior researcher at NVIDIA who leads the GEAR team, said some things about this work that show why the research is so remarkable. He points out that not every foundation model needs to be gigantic: they trained a 1.5-million-parameter neural network to control the body of a humanoid robot. As he explains, it takes a lot of subconscious processing for us humans to walk, maintain balance, and maneuver our arms and legs into desired positions, and HOVER captures this "subconsciousness" in a single model that learns how to coordinate the motors of a humanoid robot to support both locomotion and manipulation. You might be thinking 1.5 million parameters sounds tiny, but you're going to be surprised at the results.
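The transcript doesn't describe HOVER's actual architecture, but to make the "1.5 million parameters" point concrete, here is a minimal sketch of how small such a whole-body control policy can be. The observation size, action size, hidden widths, and the idea of outputting joint targets are assumptions for illustration, not values taken from the HOVER paper.

```python
# Minimal sketch (NOT NVIDIA's actual HOVER model): a small MLP control policy
# whose parameter count lands near the ~1.5M figure mentioned in the video.
import torch
import torch.nn as nn

OBS_DIM = 330   # assumed: proprioception plus masked motion-tracking commands
ACT_DIM = 19    # assumed: one target per actuated joint on a small humanoid

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 768), nn.ELU(),
    nn.Linear(768, 768), nn.ELU(),
    nn.Linear(768, 768), nn.ELU(),
    nn.Linear(768, ACT_DIM),
)

n_params = sum(p.numel() for p in policy.parameters())
print(f"parameters: {n_params:,}")   # ~1.45M with these sizes

# At control time, acting is a single tiny forward pass per step:
obs = torch.zeros(1, OBS_DIM)        # placeholder observation
joint_targets = policy(obs)          # e.g. targets fed to low-level PD control
```

A network this small runs comfortably at real-time control rates on embedded hardware, which is part of why a compact whole-body policy is attractive in the first place.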

Segment 2 (05:00 - 10:00)

The crazy thing is that this is exactly what let them train so much faster: NVIDIA has a GPU-accelerated simulator, Isaac, that delivers results about 10,000 times faster than real time. Imagine you wanted to teach a robot how to move and perform certain tasks. Instead of spending an entire year training the robot in the real world, which would be incredibly slow and expensive (a real robot will fall and break things), they create a Matrix-like virtual training ground using NVIDIA's Isaac simulation. The simulation is so powerful that it can squeeze an entire year's worth of robot training into just about 50 minutes of real-world time. That's like learning to become a master chef in the time it takes to watch a TV episode. Because the simulation runs roughly 10,000 times faster than real life, while we're taking one step, the robot in the simulation has already practiced that step 10,000 times. You can think of it like having a time machine: you can practice something over and over, make all those mistakes, and do it incredibly quickly.

The craziest part is that they state the whole thing is zero-shot. After the robot has finished training, about 50 minutes of wall-clock time standing in for a year of experience, the neural network transfers zero-shot to the real world without any fine-tuning. You can take a policy trained purely in simulation, put it on a physical robot, and it works: no fine-tuning, nothing to patch up when it behaves strangely, it immediately starts working, which I think is absolutely insane. Transferring zero-shot without any fine-tuning is absolutely insane.

And do you want to know the thing that made me feel we're actually starting to get somewhere? One of the lead researchers on the project notes that this generalist HOVER system actually outperforms specialist policies trained for specific modes. HOVER, which can do many things, performs better than systems designed to do just one task. That's counterintuitive, because typically specialists outperform generalists. The researchers believe the superior performance comes from leveraging shared physical knowledge across different modes: the model apparently learns fundamental principles that apply to every movement, for example how to keep balance while moving. You can see how striking this is in their comparison: the specialist, shown in blue, doesn't reach all of the target points, or doesn't reach as far into them as it needs to, while the green regions cover a more comprehensive area, which is genuinely surprising.
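The video attributes the speedup to simulating huge numbers of robots in parallel on the GPU. The snippet below is only a toy illustration of why that compresses a "year of experience" into roughly 50 minutes of wall clock; it does not use the real Isaac API, and the environment count, control rate, and state sizes are assumptions.

```python
# Toy illustration (not the Isaac API) of massively parallel simulation.
import torch

NUM_ENVS = 10_000                  # parallel simulated robots (assumed)
SECONDS_PER_YEAR = 365 * 24 * 3600
CONTROL_HZ = 50                    # assumed simulation/control rate

# If each environment runs at roughly real-time speed, experience accumulates
# NUM_ENVS times faster than the wall clock:
wall_clock_min = SECONDS_PER_YEAR / NUM_ENVS / 60
print(f"~{wall_clock_min:.0f} minutes of wall clock per simulated year")  # ~53

# The mechanism: every step advances all environments at once as one batched
# tensor operation, ideally on the GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
state = torch.zeros(NUM_ENVS, 48, device=device)         # assumed state size
for _ in range(CONTROL_HZ):                               # one simulated second
    action = torch.randn(NUM_ENVS, 19, device=device)     # placeholder policy
    # stand-in dynamics; a real simulator integrates rigid bodies and contacts
    state[:, :19] = state[:, :19] + 0.02 * action
```

The back-of-the-envelope arithmetic is where the "one year in about 50 minutes" claim comes from: 525,600 minutes in a year divided by a ~10,000x speedup is roughly 53 minutes.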
Now, of course, you might be wondering how they managed to train this. Don't worry, I'm going to break it down as simply as I can. The training diagram begins with a dataset of human-like movements: on the left you have the retargeted motion dataset, a collection of motions the robot should learn to replicate, things like walking, moving its arms, or keeping its balance. Then there is the oracle policy, which is basically the teacher or expert model. It has been trained on a lot of data and has a deep understanding of how the robot should move, but it is complex and not yet ready for direct use on the robot.

Next come proprioception and command masking. To help the student learn, the system breaks the movements down into simpler tasks, using two kinds of masks: a mode mask, which decides the specific kind of movement, such as arm or leg motion, that the student should focus on, and a sparsity mask, which keeps only certain important parts of each movement so the learning process is simpler and less overwhelming. Then there is learning through supervision, where the student tries to copy the actions of the oracle policy as closely as possible. This copying, or supervised learning, means the student gets feedback whenever its actions don't match the expert's, and over time the student learns to move just like the expert. Once the HOVER policy has learned enough from the oracle policy, it can use those learned skills in the real world without further adjustment, because it has been trained in a way that translates directly into real-life robot movement. In short, the diagram shows how an expert model teaches the HOVER policy and how that policy then transfers to a real robot; there's a rough sketch of this idea at the end of this segment.

The next image shows how HOVER, as a versatile control system for humanoid robots, works end to end: it lets the robot handle different types of movement commands from a variety of control devices. At the top, the system receives inputs from devices like VR headsets, cameras, exoskeletons, robot arms, even joysticks, and these devices send information about the movements the robot should perform, whether that's moving its head, its hands, or its whole body. In the middle row, the system tracks different modes for different body parts: kinematic position tracking, which focuses on where parts of the body should be positioned; local joint angle tracking, shown in yellow, which controls the exact angles of joints like the elbows or knees; and root tracking, which manages the central movement of the body, like overall balance or moving the torso. I'm not going to get into it much more deeply, but in simple terms this setup allows HOVER to receive commands from multiple sources, figure out which parts of the robot to move, and execute those movements in a controlled and coordinated way.
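To make the teacher-student picture concrete, here is a minimal sketch of the kind of distillation loop described above: a student policy is trained by supervised learning to copy an oracle's actions while a mode mask randomly hides whole command channels (keypoint positions, joint angles, root motion), so the student learns to follow whichever subset of commands a device happens to provide. All names, dimensions, and the plain MSE imitation loss are illustrative assumptions rather than HOVER's actual code, and the finer-grained sparsity mask over individual body parts is omitted for brevity.

```python
# Sketch of oracle-to-student distillation with command (mode) masking.
# Dimensions and names are assumed for illustration, not taken from HOVER.
import torch
import torch.nn as nn

PROPRIO_DIM = 63                       # assumed proprioceptive observation size
CMD_DIMS = {"kinematic_pos": 24,       # assumed per-mode command sizes:
            "joint_angles": 19,        #   keypoint targets, joint targets,
            "root": 6}                 #   root velocity/orientation targets
CMD_DIM = sum(CMD_DIMS.values())
ACT_DIM = 19

student = nn.Sequential(nn.Linear(PROPRIO_DIM + CMD_DIM, 768), nn.ELU(),
                        nn.Linear(768, 768), nn.ELU(),
                        nn.Linear(768, ACT_DIM))
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

def random_mode_mask(batch: int) -> torch.Tensor:
    """Zero out whole command modes so the student must cope with any subset."""
    chunks = []
    for dim in CMD_DIMS.values():
        keep = torch.randint(0, 2, (batch, 1)).float()  # keep or drop this mode
        chunks.append(keep.expand(batch, dim))
    return torch.cat(chunks, dim=1)

def distill_step(proprio, commands, oracle_actions):
    """One supervised update: copy the oracle's action under a masked command."""
    mask = random_mode_mask(proprio.shape[0])
    obs = torch.cat([proprio, commands * mask], dim=1)
    loss = nn.functional.mse_loss(student(obs), oracle_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with placeholder tensors standing in for rollouts labeled by the oracle:
loss = distill_step(torch.randn(256, PROPRIO_DIM),
                    torch.randn(256, CMD_DIM),
                    torch.randn(256, ACT_DIM))
```

Because the mask is resampled every update, the student never gets to rely on any single command channel, which is one plausible reading of how a single policy ends up serving VR teleoperation, joystick control, and full motion tracking at once.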

Segment 3 (10:00 - 10:06)

fil IDE comp of movement and control
