# Google's New Humanoid Robots Are Incredible - Gemini 2 Robotics

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=MapsWdyanGk
- **Date:** 13.03.2025
- **Duration:** 11:58
- **Views:** 100,699
- **Source:** https://ekstraktznaniy.ru/video/13219

## Description

Join my AI Academy - https://www.skool.com/postagiprepardness 
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website - https://theaigrid.com/


Links From Today's Video:
https://deepmind.google/discover/blog/gemini-robotics-brings-ai-into-the-physical-world/

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

Music Used

LEMMiNO - Cipher
https://www.youtube.com/watch?v=b0q5PR1xpA0
CC BY-SA 4.0
LEMMiNO - Encounters
https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Transcript

### Segment 1 (00:00 - 05:00)

Google has brought us a really stunning update to how they're advancing robotics, and in this video I'll dive deep into one of the new models they've embedded into humanoid robots that really progresses the field. Essentially, what you're going to see is how Gemini 2.0 is integrated into robotics, how this is progressing with the Apptronik platform, and how robotics is being integrated into the real world. It's truly interesting, because these robots are now at a stage where they can do a lot more than previous iterations. I'm sure some of you may have seen Google's previous ones before, but the update here is truly impressive, with these robots now able to reason about the physical world.

We're bringing Gemini 2.0's intelligence to general-purpose robotic agents in the physical world. To be helpful, robots need to be interactive, responding live to your actions and your voice. They need to be dexterous, to complete your most complex tasks. And they need to be general, to understand things in your 3D world. All of these capabilities need to work across different physical forms, and we're bringing this together in Gemini Robotics, our most advanced vision-language-action model.

Gemini Robotics is interactive. "Can you put the bananas in the clear container?" Notice how we move the objects and the model reacts and replans on the fly. "Can you put the grapes in the clear container?" Our model's low latency means it can respond live to rapidly changing conditions and instructions, and this same model can generalize to all kinds of applications where you can collaborate with the robot live.

Gemini Robotics is dexterous. High-dexterity tasks are some of the biggest challenges in robotics. "I can fold the orange square into an origami fox." "That sounds fun, why don't we try that?" "Sure. Or did you know that the word origami comes from the Japanese words ori, meaning to fold, and kami, meaning paper?" These capabilities are enabled by Gemini 2.0's spatial understanding of detailed aspects of things in your world: "I can point to where the eyes should be drawn on the fox."

Most importantly, Gemini Robotics is general. It uses Gemini 2.0's world understanding to generalize across a vast range of real-world tasks. "Can you flip the red die so that it matches the number on the green die?" Many robots can execute predefined actions, but these movements are not predefined: the robot is reasoning both about what it sees and how to move, and it figures out how to make the red die match, just like we asked. This generalization goes even further, because the same model can handle tasks it has never been trained to do. "Pick up the basketball and slam dunk it." Keep in mind, these are objects the robot has never seen before, but by leveraging Gemini 2.0's understanding of concepts like basketball and slam dunks, the robot figures out the task. We're now inviting more partners to join our trusted tester program, where we're working together to build the next generation of robotic AI agents.

Now, one of the key features of Gemini Robotics is that it can perform tasks without task-specific training: it can do them zero-shot, or with only a handful of demonstrations, which is what they call few-shot. This is really important, because traditionally robotic systems require extensive task-specific training. Gemini's method significantly reduces the amount of data a robot needs in order to adapt, which makes it much easier and faster to teach robots new tasks. And the crazy thing is that it can even generalize to tasks it has never seen before. I think this really matters: a lot of the time, critics of robotics and AI say these models and systems can't generalize outside of their training data, and with this kind of progress, robots won't have to be trained purely in a simulation that exactly matches their physical world in order to function effectively. They'll simply use the model embedded within them, analyze the environment, and make decisions, much like humans make decisions on a day-to-day basis. "Help me get organized." "Let's start by putting the pen with the other pencils." "Okay, I will move..."
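The "reacts and replans on the fly" behavior described above is characteristic of closed-loop control: the policy is re-queried from a fresh observation at every step instead of executing a plan computed once up front. Below is a minimal, hypothetical sketch of that idea; `greedy_policy` is an invented stand-in for the VLA model (the real Gemini Robotics interface is not shown in the video), but the loop structure illustrates why a moved object doesn't break the task:

```python
from typing import Callable, Dict, Tuple

Position = Tuple[float, float]

def greedy_policy(obs: Dict[str, Position], instruction: str) -> Position:
    """Stand-in for the VLA model: name the target from the instruction
    and return its current location. (The real model emits low-level
    robot actions; this stub is purely illustrative.)"""
    target = instruction.split()[-1]  # e.g. "reach the banana" -> "banana"
    return obs[target]

def control_loop(get_obs: Callable[[], Dict[str, Position]],
                 instruction: str, steps: int) -> Position:
    """Closed-loop control: re-observe and re-plan at every step, so the
    commanded motion tracks objects even when a human moves them."""
    gripper = (0.0, 0.0)
    for _ in range(steps):
        obs = get_obs()                       # fresh camera observation
        target = greedy_policy(obs, instruction)
        # take a small step toward wherever the target is *now*
        gripper = (gripper[0] + 0.5 * (target[0] - gripper[0]),
                   gripper[1] + 0.5 * (target[1] - gripper[1]))
    return gripper

# Simulate a human sliding the banana from (1, 0) to (4, 0) mid-task.
positions = iter([(1.0, 0.0)] * 2 + [(4.0, 0.0)] * 6)
get_obs = lambda: {"banana": next(positions)}
final = control_loop(get_obs, "reach the banana", steps=8)
print(final)  # ends up near (4, 0), the banana's *new* location
```

Because the target is re-read on every iteration, the motion automatically converges on the banana's new position after the human slides it; an open-loop plan computed from the first observation would keep steering toward the stale location.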

### Segment 2 (05:00 - 10:00)

"Pick up the basketball and slam dunk it." "Okay, I will." It goes through the net. "Good job!"

Now, in the interactive update they talk about how this can respond not only to new environments but to rapidly changing environments. The robot understands and dynamically updates where it's moving things, even as things move around in that environment. In this clip you can see a human moving different things around, and the robot analyzes in real time where the objects are and then completes its tasks. I think this is, once again, really important, because the real world is constantly changing: as you're crossing a street, cars are moving by, and in your environment things move around you. Maybe you have a pet, maybe someone's moving the table, and you need to be constantly aware of your surroundings in order to be remarkably effective. Seeing the robot do this in real time is really impressive.

One of the key things here, which I think is even more impressive and which most people may not have picked up on, is that this is at 1x speed and fully autonomous. Many robotics demos before have been shown at five times speed due to the slow nature of the policies, but clearly Google has managed to do something innovative with efficiency, because what we see here is relatively fast even for humans. I would say this update shows just how quickly the pace of robotics is going. "Can you put the bananas in the clear container, grapes in the pink container?" "Hey, can you erase the whiteboard for me?"

Now, one of the most important things for robotics is fine motor skills and coordination, and in this demo we can see that Google's Gemini does this with a remarkable level of efficiency. The robot can carry out very complex and very intricate tasks, such as placing glasses into a holder, using the Apollo robot to place game pieces onto the board, and folding paper in a really precise manner. I think this is really important, because the scale of what you can do with a robotics platform is often limited by the hardware. If you can take very basic hardware, like these two grippers, and complete a wide variety of tasks with it, then in the future, when you have something with more degrees of freedom, you're quite likely to be able to do an even wider range of tasks, leading to more applications, probably even more than humans are initially used to. Maybe in the future we'll actually see robots do things with their hands that humans really just can't do, and I think that's going to be a real shocker when it occurs.

Now, something really interesting, and probably one of the most important features of the Gemini platform, is that it can swiftly adapt to new robot platforms: it can be moved to humanoid robots or bi-arm industrial robots, and it can do this with minimal data. The reason this matters so much is that deploying robotic intelligence across different types of hardware

### Segment 3 (10:00 - 11:00)

platforms is often quite challenging, and Gemini's approach shows impressive adaptability, enabling the same model to generalize quickly to new robot shapes and capabilities. They've demonstrated this by successfully adapting Gemini Robotics from a bi-arm robot to a humanoid robot with five-fingered hands, which allowed it to quickly execute intricate manipulation tasks. This is clearly something that will advance the field of robotics across the board: imagine being able to put one unified model into a robot and have it immediately usable, pretty much like a software update. Across the industry this week alone, we've seen many different robots show us that their hardware limitations aren't as big as we thought, and with the continual progression of these internal models, we're likely to see more generalization across the board.

Now, Google also introduced Gemini Robotics-ER, a VLM, a vision-language model, with an unprecedented ability to deeply understand the physical environment through enhanced embodied reasoning. This matters a lot because traditional robots mostly perform isolated tasks in pre-programmed settings, whereas Gemini Robotics-ER allows robots to inherently reason about spatial concepts: object affordances (for example, where to grasp something), 3D spatial relationships, and trajectories, intuitively, much like humans do naturally. Where this was proven is that Gemini Robotics-ER demonstrates state-of-the-art performance on embodied-reasoning benchmarks, which we hadn't seen before, and I think this is really important because it shows that Google is at the frontier of advancing robotics.

So let me know what you thought about this update from Google. I think this is one of the most underrated things I've seen so far. I do think the Apptronik platform is really cool, and I can't wait to see more updates from Google, as they've had pretty impressive updates in the past that have truly stunned the internet.
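The embodied-reasoning abilities listed above, pointing at objects, judging where to grasp, reasoning about spatial layout, can be pictured as structured spatial queries answered from an image. The sketch below is purely illustrative: `Scene`, `point_to`, and `grasp_point` are invented names (the real Gemini Robotics-ER interface is not shown in the video), and the toy "grasp near the top of the mug" rule only stands in for the kind of contextual affordance judgment an ER model would make:

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[int, int]  # pixel coordinates in the camera image

@dataclass
class Scene:
    """Hypothetical stub for what an embodied-reasoning VLM extracts
    from an image: named objects with 2D bounding boxes."""
    boxes: Dict[str, Tuple[int, int, int, int]]  # name -> (x0, y0, x1, y1)

    def point_to(self, name: str) -> Point:
        """'Point' at an object: here, the center of its detected box."""
        x0, y0, x1, y1 = self.boxes[name]
        return ((x0 + x1) // 2, (y0 + y1) // 2)

    def grasp_point(self, name: str) -> Point:
        """Toy affordance rule: grasp near the top quarter of the object
        rather than its center, standing in for learned grasp reasoning."""
        x0, y0, x1, y1 = self.boxes[name]
        return ((x0 + x1) // 2, y0 + (y1 - y0) // 4)

scene = Scene(boxes={"mug": (100, 200, 180, 320)})
print(scene.point_to("mug"))     # (140, 260)
print(scene.grasp_point("mug"))  # (140, 230)
```

The point is the output format: instead of a fixed action script, the model answers open-ended spatial questions ("where is X", "where should I grasp X") that a downstream controller can act on, which is what lets the same reasoning transfer across tasks and scenes.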
