# Unitree G1 - Moving the arms/hands - Dev w/ G1 Humanoid P.3

## Metadata

- **Channel:** sentdex
- **YouTube:** https://www.youtube.com/watch?v=Uc1nhT8beTU
- **Date:** May 9, 2025
- **Duration:** 29:32
- **Views:** 33,894

## Description

Figuring out how to move the hands/arms in an abstract way in XYZ space rather than per-joint.

Unitree G1 series playlist: https://www.youtube.com/playlist?list=PLQVvvaa0QuDdNJ7QbjYeDaQd6g5vfR8km

Github for this project: https://github.com/Sentdex/unitree_g1_vibes

Neural Networks from Scratch book: https://nnfs.io
Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Reddit: https://www.reddit.com/r/sentdex/ 
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Instagram: https://instagram.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex

## Contents

### [0:00](https://www.youtube.com/watch?v=Uc1nhT8beTU) Intro

What is going on everybody? Welcome to another video with Jeff the G1. Today I've got something very exciting to show you. I've got my keyboard here, and if I hit the up arrow, you can see the hand moves up. If I go down, the hand moves down. I also have the key presses logged on the GUI, so I don't have to keep shoving the keyboard in your face. We can move the hand left (that's relative to the robot, so it's the robot's left), left again, right, forward, backward. I think you get the idea. Essentially, we have Cartesian control of the hand: we're simply moving the hand around in Cartesian space. Unfortunately, the SDK does not expose that. You actually have to control all seven motors in the robot's arm and figure out how to turn that into hand motion.

I understand that to some of you this looks really basic and boring, and there are just some of you who cannot be pleased, and I understand that. But here's what's cool. I'm thinking about how you get the robot to actually do something useful. A $65,000 bottle of bubbly water from the kitchen seems like a good starting task: can you get this robot to go to the kitchen, find the water, grab the water, and bring it back to you? So I started thinking about this. Okay, we've got the locomotion solved. We've got the LiDAR and walking around. We've got the occupancy grid. We're very close to being able to go to the kitchen and get that bottle of water. But the problem is when you get there. You see the water, and you can definitely use vision-language models to identify where the kitchen is and where the water is; I have no qualms about that part. But once you've locomoted your way over to that water, I was expecting the API to have high-level control of the hands.
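
The keyboard layer in the demo above is deliberately thin; the learned model underneath does the real work. A minimal sketch of that key-to-command mapping, where the key names and the `move_hand` callback are hypothetical stand-ins rather than code from the actual repo:

```python
# Hypothetical key-to-direction mapping for the Cartesian hand demo above.
# The real GUI in the repo may differ; this only shows how thin the
# keyboard layer is compared to the per-joint problem underneath.
KEY_TO_COMMAND = {
    "up": "up", "down": "down",
    "left": "left", "right": "right",  # the robot's left/right, not the viewer's
    "w": "forward", "s": "backward",
}

def on_key(key, move_hand):
    command = KEY_TO_COMMAND.get(key)
    if command is not None:
        move_hand(command)  # the model maps (current pose, command) -> new pose
```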

### [2:00](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=120s) Unitree SDK

So, let's just bring up the Unitree SDK real quick. If we go into the G1 examples, high-level, you'll see there are high-level arm controls. Now, silly me, I was assuming high-level control was going to mean moving the hand in Cartesian space, XYZ coordinates, because that's basically what you would want. If you want to control the hands and actually go grab a thing, you really want to move the wrist, but mostly the hand, through Cartesian space somehow, get to the object, and grab it. You want to think in very abstract terms: up, down, left, right, forward, back. Unfortunately, when I found the high-level SDK example, I realized very quickly that it's actually just joint-level control.

That's a pain, because it might still sound like a relatively trivial task to operate all these motors, but there are a lot of them. This arm has seven different motors. Some spin, some rotate; they work on different axes. And if you rotate this one, the actual translation to Cartesian coordinates changes: spin just this one, and suddenly this other joint does something entirely different. Multiply that per motor; every motor down the line is impacted. So the actual task of moving that hand in Cartesian space is hard. It's just hard.

So then I realized this and thought, oh no, I'm going to have to go to a simulator or something like that, simulate the G1, and that's going to be a nightmare. We're going to need reinforcement learning; we're going to have to figure out some sort of policy and reward function, all of that. Yikes. Essentially, reinforcement learning sounds as if it's unsupervised, and that sounds good, but it's very much supervised, because you're just tuning a reward function until you get the outcome you desired. It feels very supervised. So I didn't really want to do it. Plus, you have the whole sim-to-real problem. Now, I do think moving the hand through space would likely transfer well from sim to real; that would probably be a successful endeavor. But I really didn't want to do it.

So then I started wondering: what if you just show the robot? For example, if you have a limp arm (this one I can't really move right now because it's stiff), you can demonstrate. You would say: okay, if you want to show the robot arm how to go up, this is how it goes up. And from here, if you want to go left according to the robot, you go this way. What if you just made a whole bunch of samples like that? That's what I tried, and that was the result, which I actually found fascinating, because it only took about 50 samples per action. And that was the minimum; I didn't even try fewer, so it might be doable with even less. You start getting pretty good results at around 50 samples, which I thought was crazy. So let me show you and work through all the things I did. Also, I do have a little bit of a
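
For reference, "seven motors" here means the shoulder (pitch, roll, yaw), the elbow, and the wrist (roll, pitch, yaw). A sketch of that layout, using the index numbering that appears in Unitree's G1 examples; treat the exact numbers as an assumption to verify against your SDK version:

```python
# The seven right-arm joints referred to above, using the 29-DoF index
# layout from Unitree's G1 examples. Verify against your SDK version;
# this numbering is an assumption, not something stated in the video.
from enum import IntEnum

class RightArmJoint(IntEnum):
    SHOULDER_PITCH = 22
    SHOULDER_ROLL = 23
    SHOULDER_YAW = 24
    ELBOW = 25
    WRIST_ROLL = 26
    WRIST_PITCH = 27
    WRIST_YAW = 28
```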

### [5:40](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=340s) GitHub

to-do list, so I just want to make sure I share all the other changes that were made. First of all, I did upload the files to GitHub. One of the other crazy things is that I did not expect so many people to have, and be interested in, the G1, especially the EDU. I'm blown away by how many people have reached out to me saying they have a G1, and of those people, how many have the EDU, the one you actually program, without any programming experience. They're not even programmers. We're talking about people who have never even heard of these LLMs, ChatGPT, all that stuff; they don't even know what that is. That's crazy to me. The market for humanoids, the demand and thirst and interest, is shocking to me. It's incredible.

So anyway, I did eventually put up, well, not everything, but many of the files are now on GitHub. You can come here and peruse them if you'd like, or use them. I did put pointers to basically all the SDKs: this one is for the dev camera (RGB), this one is for the LiDAR SDK. I am going to have to fork the SDK and then probably point to my fork, because I am slowly making some changes. I think everything on here will still work, but I've had to change some things. I really just needed it up so I could link people to the files, because people were asking for them.

Coming over here: we actually did fix this. Whenever the robot was walking through space with the LiDAR, it was detecting the little edges of its own head. I fixed that. It's basically just a filter: any points detected at about the level of the LiDAR unit and within a few inches of it get ignored, just filtered out. Otherwise, when he was walking around, he was leaving a trail of points behind him, and that was a problem for the occupancy grid, because it obviously was not accurate.

On that note, the occupancy grid in the previous videos was also not accurate, but as you can see, it is now consistent with the LiDAR, and also consistent with reality. The problem was this: the LiDAR, or really the SLAM output, is already flipped, because the LiDAR unit itself is mounted upside down. I actually got a comment asking why I mounted it upside down. I didn't mount it; that's how it comes from Unitree, that's how they manufactured it. And it's not a big deal that it's upside down; you just have to account for it, and that was not hard. But then, for whatever reason, o3 (which, as you likely know by now, I've been using for a lot of this development) flipped it again, a redundant flip, and that's what was wrong with the occupancy grid. Interesting. Okay, so that got fixed.
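
The head filter he describes is conceptually just a reject zone around the sensor. A minimal sketch, assuming points arrive as an N-by-3 array in the LiDAR frame; the thresholds are illustrative, not the repo's actual values:

```python
import numpy as np

def filter_self_hits(points, height_band=0.05, radius=0.10):
    """Drop LiDAR returns that sit roughly at the sensor's own height and
    within a few inches of it, i.e. the robot's head rather than the world.
    `points` is an (N, 3) array in the LiDAR frame (sensor at the origin).
    The thresholds here are illustrative, not the repo's actual values."""
    near_height = np.abs(points[:, 2]) < height_band       # near sensor level
    near_radius = np.linalg.norm(points[:, :2], axis=1) < radius
    return points[~(near_height & near_radius)]
```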
Let's see what else we had. The battery percentage: I really wanted to display the battery percentage, and I'm kind of surprised this is not a thing in the SDK. I was expecting we could at least, at minimum, read a voltage off the battery and from there calculate the percentage remaining. No, that is not an option. Okay, I'm pretty surprised.

The other thing is gait. He can actually walk faster now. Everything you've seen him walking up to this point was about half speed; you can now press and hold Shift and he goes a little quicker. I might put up a clip of that, maybe not; depends how interested I am at the point of editing.

And then the e-stop. People kept sending me videos of the H1 going crazy on somebody. I do have the remote now, and you can see I've labeled it with the nice red tape: if you press L1 and A, the robot goes into a damp state. I will show that now. Boom. Dead. He is e-stopped. I did that just to be safe, so now the remote is essentially always on whenever I'm doing anything with the robot. It usually has a longer battery life than the robot itself, so it's easy to keep it going. And the Git repo is uploaded, so everything's done there. I think that's everything. So besides the arm movement, those are all the changes that have been made up to this point.

Now what I want to do is walk you through, step by step, how we did it. When I say "we," it's always going to be OpenAI's o3, with me kind of guiding the way. So now I think it's going to be okay to just leave him damp. I added all kinds of little helper things; on here I can actually damp the arms and center the waist.

Part of the problem was this. With joint-by-joint arm control, one of my concerns was: if we take over the arms, are we going to lose the legs? The threat was that as soon as you take over a certain topic (I think "topic" is the right word on the DDS side of things), it's now your topic; you have to control it. So I was very afraid that taking over control of the arms meant I would now have to control the gait as well. That is not the case. There is a special arm topic, and when you take it over through the SDK, what it gives you is everything from the waist up. Everything below the waist is still controlled for you. So you can move the waist and the arms, do all kinds of stuff, and the legs will always keep you balanced, which is actually pretty cool. It's pretty crazy: the whole upper body could be damp, the robot could be doing whatever you want it to do or be totally limp up top, and the legs just keep walking around. It's very interesting to see.

So that was cool. From there, I was trying to figure out how to start proving out this idea I had, so let me walk you through that process. The first thing I did was look at the values coming from the actual arms. So: Python 3.10, the arm GUI script. You can just take my word for it, that's what I typed. And that gives us this window, and on here we can see the... wow, that's weird. I don't know why it was jumping around.
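
On the waist-up takeover: a sketch of what publishing to that dedicated arm topic looks like, modeled on Unitree's own arm-sdk example for the G1. The topic name, message module, and the not-used-joint enable trick are assumptions to check against your unitree_sdk2py version; real code also sets gains and a CRC per Unitree's example:

```python
# Sketch of taking over the waist-up "arm" topic described above, modeled
# on Unitree's published arm-sdk example for the G1. Verify the topic
# name, message types, and enable trick against your unitree_sdk2py.
from unitree_sdk2py.core.channel import ChannelPublisher, ChannelFactoryInitialize
from unitree_sdk2py.idl.default import unitree_hg_msg_dds__LowCmd_
from unitree_sdk2py.idl.unitree_hg.msg.dds_ import LowCmd_

NOT_USED_JOINT = 29  # in Unitree's example, q=1 here signals "arm sdk active"

ChannelFactoryInitialize(0, "eth0")            # network interface to the robot
pub = ChannelPublisher("rt/arm_sdk", LowCmd_)  # waist-up topic; legs stay balanced
pub.Init()

cmd = unitree_hg_msg_dds__LowCmd_()
cmd.motor_cmd[NOT_USED_JOINT].q = 1.0          # take over waist-up control
# ... set cmd.motor_cmd[i].q / kp / kd for each arm joint here ...
# (Unitree's example also computes a CRC before publishing; omitted here.)
pub.Write(cmd)
```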
So really the very first thing I wanted to do was test, while he was on the hanger, whether I could adjust one of the joints without the robot losing its mind. I just wanted to move an elbow, but I wanted to double, triple, quadruple check that I had the right joint. So I made this program to start (and again, by "I" I mean with o3). You can see here, obviously, this is the right arm, and here is our elbow. If I move it, you should be able to see that it's the value changing the most; obviously I'm moving the arm a little as I do this, so other values change too. Then we also have the waist. If I move that, you'll see it change. I ended up just locking the waist for the actual data collection and control; I keep the waist at zero. But you can address it as well and still maintain balance, so that's kind of cool.

And then I started thinking: okay, I have all these values for the right arm. If you start here, you could begin a sample. Say you want a sample of "up": you literally make an array of all of those joint values, then you move the hand up, and boom, now you have another array. That sounds like a pretty basic regression task. So then I started wondering: can we just do that? Can I just show it? Because I've done enough RL to know I really don't want to do it if I don't have to. So that was my question: can I just make a bunch of those samples and get away with it?

Obviously, you already know the end result: it worked. But what's so crazy to me is the ambiguity. Say you start here and you go up. What do you mean by "up"? How far up are we talking? How much am I moving that hand? There's so much noise there. And the same with left and right: maybe when you go left or right, you accidentally move the hand a little forward or back. It's all so imprecise when you do it manually. But the nice thing about doing it manually is that it's also, in a sense, perfect ground-truth data. I don't quite know how to put it: it's messy data, but it's accurate data. So I really wasn't sure whether I could just do this, because again, it's very messy.

And the result isn't perfect, the left-right control especially, but it's good enough that if you're watching with a camera, you see the object, and you know you want to get your hand to that object, you can keep figuring out what XYZ change you need and keep issuing commands to get there. It's good enough to get it there, and that's what I needed. I needed to get to the point where I could open the hand, get the hand there, close the hand, grab the thing. I still can't believe it was as easy as it was. I thought it would take many, many more samples, but per action (up, down, left, right, forward, back) it took 50 samples, and immediately I was seeing results, which is crazy.
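
The data model he lands on is just pairs: a starting joint vector plus a spoken command maps to an ending joint vector. A minimal sketch of one recorded sample; the field names are mine, not the repo's:

```python
# One demonstration sample as described above: where the 7 arm joints
# started, which direction was requested, and where the joints ended up
# after the human moved the damped arm. Names are illustrative.
from dataclasses import dataclass
from typing import List

COMMANDS = ["up", "down", "left", "right", "forward", "backward"]

@dataclass
class ArmSample:
    command: str           # one of COMMANDS
    start_q: List[float]   # 7 joint angles before the move (radians)
    end_q: List[float]     # 7 joint angles after the move (radians)
```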

### [15:39](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=939s) ArmTrain Recorder

So I've also got the actual arm-train recorder. Let me pull that up: `python3.10 arm_train_recorder.py`. I think this one defaults to the left arm, and that data has all been kind of wasted, but I'll show you how it works. You run it and it counts down, 3, 2, 1, to get you started. You can set the arm, and you can also set the number of samples you want to do; I tended to do 20 at a time. Then it gives you a verbal prompt: it will say "up," "down," "left," "right," "forward," or "backward," you do that action, and then it gives you another one, roughly every two seconds, different directions over and over.

So, for example, you would start the hand here. I'm going to run it. I think it'll come over my mic well enough, but if not, you'll have to take my word for it. So, we'll run it; I think the default is two or three samples. It counts down. "Left": left would be this way. "Backward": and we go backward. There's the "right." And that was it for this run; then it gives you a bunch of counters for the number of samples per action.

When I actually collected the data, a few things. One, I stiffened the waist so that what you just saw, the torso turning, wouldn't happen. I did seriously consider letting the waist go, because as you saw, it's genuinely beneficial: if your hand is here and you want to move backward, it's convenient to rotate your body, and if you want to reach even further forward, it's convenient to rotate your body. It's very natural to use your waist. But when I started out, I just wasn't sure I wanted to move the waist and add one more element. Plus, if you move the waist, you now have to deal with it everywhere, and the robot could get very lost very quickly. So I ended up just locking the waist. That's how I collected the data; it was a little better when I wasn't talking while trying to move the arm, but that's it. That's all I did for all the samples: basically 50 samples per action, with the recorder randomly picking a direction from the list each time, and boom, you have a model that can do all those things. So anyway, we gathered the samples. Of course, that doesn't mean a model got trained, so what happens next?
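
A minimal sketch of the recorder loop just described: count down, then prompt a random direction every couple of seconds and snapshot the arm joints before and after each move. The `read_arm_joints` and `say` hooks are stand-ins for whatever the real script uses:

```python
import random
import time

COMMANDS = ["up", "down", "left", "right", "forward", "backward"]

def record_session(read_arm_joints, say, n_samples=20, hold_s=2.0):
    """Prompt random directions and record (start, command, end) triples.
    `read_arm_joints` returns the 7 current joint angles; `say` speaks or
    prints a prompt. Both are hypothetical hooks, not the repo's API."""
    samples = []
    for t in (3, 2, 1):                  # countdown before recording starts
        say(str(t))
        time.sleep(1.0)
    for _ in range(n_samples):
        command = random.choice(COMMANDS)
        start_q = read_arm_joints()      # joints before the human moves the arm
        say(command)                     # spoken prompt: "up", "left", ...
        time.sleep(hold_s)               # time to perform the motion
        end_q = read_arm_joints()        # joints after the move
        samples.append({"command": command, "start_q": start_q, "end_q": end_q})
    return samples
```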

### [18:18](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=1098s) Data

The samples go into the data directory, then `arms`, and for now I've separated them into left and right. Most of my data is in `right`, so we can open that. This is all the data so far. Again, all you get is the direction, which is essentially the command. Then you have the starting position for all of those motors, the joints on the actual arm; this is the right arm. You've also got joint 12, which was the waist, but we're not actually using that in training right now. Really, all of those waist values are going to be more or less zero, since the waist is being set to zero; there's a little bit of variance there, but I'm not using them at the moment. Then we have all the other joints for the actual arm. So we're just tracking the starting position of all of those joints and then the ending positions. That's it; that's all we're doing. And we're mapping them to the direction input. So we have all this data collected.
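
Turning those recordings into a training set is then just array assembly: the input is the start pose plus a one-hot encoding of the command, and the target is the end pose. The layout below is my guess at what the training script does, not a copy of it:

```python
# Build a regression dataset from the recorded samples described above.
# Input: 7 start joint angles + 6-way one-hot command. Target: 7 end
# joint angles. This layout is a guess, not the repo's exact format.
import numpy as np

COMMANDS = ["up", "down", "left", "right", "forward", "backward"]

def build_dataset(samples):
    X, y = [], []
    for s in samples:
        one_hot = np.zeros(len(COMMANDS))
        one_hot[COMMANDS.index(s["command"])] = 1.0
        X.append(np.concatenate([s["start_q"], one_hot]))
        y.append(np.asarray(s["end_q"], dtype=float))
    return np.stack(X), np.stack(y)
```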

### [19:26](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=1166s) Train

Once we've done that, we come back over to the data directory, and then, yeah, train. We'll open this; I'll just toss it in there. Not even on the GPU or anything: we're using scikit-learn for a very simple MLP. It's two hidden layers, 32 neurons each. Just a tiny little baby model. The input, again, is all of those joint positions plus the command, and the output is the target end pose we're hoping the model gets to. We run the regressor. It's literally, I'm pretty sure, ReLU (rectified linear) as the activation function for the hidden layers, the Adam optimizer, and the final layer is just a linear output. Boom, done. The simplest regressor, mean squared error for the loss function. And that's it. Then we'll come over here.
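
That description maps almost one-to-one onto scikit-learn. A sketch with the stated hyperparameters (two 32-neuron hidden layers, ReLU, Adam, linear output, squared-error loss, which is MLPRegressor's default); the stand-in random data is only there to make the snippet runnable on its own:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Stand-in data shaped like the dataset sketched earlier: 7 start joints
# plus a 6-way one-hot command in, 7 target joints out.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))
y = rng.normal(size=(300, 7))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPRegressor(
    hidden_layer_sizes=(32, 32),  # "a tiny little baby model"
    activation="relu",            # ReLU hidden layers, linear output
    solver="adam",                # Adam optimizer; loss is squared error
    max_iter=60,                  # he trains for about 60 epochs
    random_state=0,
)
model.fit(X_train, y_train)
print("test R^2:", model.score(X_test, y_test))
```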

### [20:30](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=1230s) Training Samples

We'll come into the artifacts directory, and these are training samples. So this was the very first one... actually, this was probably not the first one, because that's 550, and there are, what, six actions. But I basically saved a few snapshots at whatever sample counts I happened to have. At the moment we're up to about 1,700 samples in all.

Then the training curves; this is, I think, for the most recent run. We just saved this, and again, I can't stress enough, all of this is o3. We vibe-coded all of it. It was my idea to try the "hey, what if we just show it" approach, but as far as the data collection, all the GUIs, even this plot: when it came up I just said, yeah, can you orient it this way, I want the joint number in yellow so it's easier to see, all that stuff. And I wanted the text to scale up if I made the window bigger, which was really just for the video; I didn't need it otherwise. It's so fast now: you want a UI for any reason, boom, done. I would never have built this in the past if it were just me.

So this is the training run. We went for 60 epochs. The test loss is higher than the training loss; that's fine, that's basically what you expect. And at some point after epoch 60 we'd be overfitting. But the training curves look pretty darn good to me. I was also curious about the different possible actions: is there any action that, for whatever reason, doesn't fit as easily? In this case it's "back," but depending on the samples at the time, it could be any action. And really, the mean squared errors on all of these are very good. It's a great success.
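
Checking whether any one action fits worse than the others is a per-command error breakdown. A hypothetical helper, assuming the one-hot dataset layout sketched earlier:

```python
# Per-action error check, for the "is any direction harder to fit?"
# question above. Assumes inputs are 7 joint angles followed by a
# 6-way one-hot command, as in the earlier dataset sketch.
import numpy as np

COMMANDS = ["up", "down", "left", "right", "forward", "backward"]

def per_action_mse(model, X, y, n_joints=7):
    preds = model.predict(X)
    for i, name in enumerate(COMMANDS):
        mask = X[:, n_joints + i] == 1.0        # rows with this command
        if mask.any():
            mse = float(np.mean((preds[mask] - y[mask]) ** 2))
            print(f"{name:>8}: MSE = {mse:.5f}")
```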

### [22:29](https://www.youtube.com/watch?v=Uc1nhT8beTU&t=1349s) Inference

Okay, so that's it; we've trained. Then it came time to run inference. Now, I tried to be a good boy and set up the simulator, because here's the idea: collecting the data was relatively safe, since the arm is essentially in damp mode; it's not going to do anything. But when you go to run the arm policy live, there's a little bit of risk.

One thing I found: I used this joint script to set the arm's starting position, because I needed a starting pose before we start controlling it. I figured, maybe about here; I looked at what the joint positions were, wrote them down, and used those. So you start the robot, the hanger boot sequence begins in damp mode, the arms hang at the sides, and then the command is issued, and the command is simply: set the joint positions to what I recorded, which is essentially here. If your wrist was right here when that command fired, maybe it won't break your wrist. Maybe. But it's going to hurt really badly if your hand is there. The rate at which this guy can move his hand is pretty shocking. I think if this dude made a little fist, he could punch you and knock you out. I'm just saying. He's very forceful.

So if you are making your own arm policy, it is essential, whether you're programming it yourself or asking o3 to program it for you, that any new position is approached along a gradient. I want it to be smooth and slow. You can always go faster later; you can always change it and get a little quicker. But the default is unbelievably fast. I'm not even going to show you, because it's pretty crazy how quickly things can get out of hand. "Violent" is the right word; it really is. These motors: people were saying, oh, it's like a toy, it's a piece of crap. What? No. This is a very high-end robot. I have seen some stuff, okay, I've seen enough to know. Those motors are nonsense; they're crazy. I was having a conversation today with somebody who was saying these robots seem really slow, they don't walk very fast. That's because that's the public gait they ship the robot with. I have no doubt this G1 could sprint faster than me. Could it go for as long as I could sprint? Probably not; eventually the motors overheat, and maybe you break a tooth on a gear or whatever. But capability-wise, it is pretty shocking.

So, long story short: set a gradient. Okay, I set that gradient. But then, if you're testing a model you just trained, you have no idea what it's going to do. So, again, long story short, I did try to be a good boy and use the simulator, and I will show you that. I'm just so positive there's a fix for this. Actually, I think there's a command I need to run; let me check my notes. Yeah, there was. Oops. Okay, it all loaded. Essentially, you have to pass the URDF data. Where did I get that? I got the URDF for the G1. As you can see, it looks pretty good; that's not terrible, right? And I did pass the policy in, so I can hit the arrow keys and do forward and back.
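
The "go there with a gradient" advice above, as a sketch: never jump straight to a target pose; interpolate from the current joints to the target in small steps at a fixed rate. The `send_joint_positions` hook is a stand-in for however joint commands are written on your setup:

```python
import time
import numpy as np

def ramp_to(current_q, target_q, send_joint_positions, duration_s=2.0, hz=50):
    """Linearly interpolate joint targets so the arm moves slowly and
    smoothly instead of snapping to the new pose at full motor speed.
    `send_joint_positions` is a hypothetical hook, not the repo's API."""
    current_q = np.asarray(current_q, dtype=float)
    target_q = np.asarray(target_q, dtype=float)
    steps = max(1, int(duration_s * hz))
    for i in range(1, steps + 1):
        alpha = i / steps                      # 0 -> 1 over the ramp
        send_joint_positions((1 - alpha) * current_q + alpha * target_q)
        time.sleep(1.0 / hz)
```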
And that all works, but clearly something's off. I'm hitting up here; it's not going up. I'm hitting left; it's not going left. I'm hitting right... well, that was sort of right. But I've seen enough to know it isn't really working. It's okay, but it looked bad. And after seeing that, I thought, man, this can't be right; it can't be that bad. I just didn't believe it. So I loaded it up on the robot, and that's what you saw: it does work on the robot. My guess is the joints are not lining up correctly in the simulator, potentially labeled or ordered differently, but I haven't tried to figure that out yet. And that's very annoying, because it means that if you do want to do RL or something like that and transfer between sim and real, it's going to be somewhat challenging. I need to look into it more.

But we are now at the point where we can walk around and control the arms at the same time. All we need to do next is figure out the hand SDK. There are three fingers that curl; I don't think that's going to be hard, it's an easily hardcoded thing. I haven't even looked at the high-level SDK for the hands yet. If it turns out the hand SDK includes arm movement, that would be hilarious, but we still learned along the way. After that, I think in the next video we might actually go to our kitchen and get that $65,000-to-$120,000-plus bottle of water, and show all the negative commenters that no, this robot is actually useful. It can do stuff for you. It can do chores in your home. It can be your nanny. It's great.

So anyway, I think that covers everything up to this point. I might still keep training or adding more data, and I might try to figure out the issue with the simulator, but I'm relatively happy with the hand controls. Like I said, I think we'd be able to issue enough commands to get where we need to go, even though it's not perfect. It really isn't. It's good enough. I'm pretty confident we could grab a can of water or a bottle or whatever. So I think I'm going to stop here, and potentially, I don't want to make a promise, but in the next video we could probably go get that can of water. If you have questions, comments, concerns, whatever, feel free to leave them below. Otherwise, yes, yet again, we will see you in another video.
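
One last sketch, of the closed-loop idea mentioned above ("issue enough commands to get where we need to go"): a camera or vision-language model estimates the hand-to-target offset in XYZ, we issue the direction command with the largest axis error, and repeat until close enough to grasp. The `get_offset` and `step_hand` hooks are hypothetical, not from the repo:

```python
import numpy as np

# Positive/negative command per axis; the axis order (x, y, z) is assumed
# to be (forward, left, up) in the robot's frame.
AXIS_COMMANDS = [("forward", "backward"), ("left", "right"), ("up", "down")]

def servo_to_target(get_offset, step_hand, tol=0.03, max_steps=100):
    """Repeatedly nudge the hand along the worst axis until within `tol`
    meters of the target. Both hooks are hypothetical stand-ins."""
    for _ in range(max_steps):
        offset = np.asarray(get_offset(), dtype=float)   # (dx, dy, dz) meters
        axis = int(np.argmax(np.abs(offset)))
        if abs(offset[axis]) < tol:
            return True                                  # close enough to grasp
        pos_cmd, neg_cmd = AXIS_COMMANDS[axis]
        step_hand(pos_cmd if offset[axis] > 0 else neg_cmd)
    return False
```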

---
*Source: https://ekstraktznaniy.ru/video/11376*