# Meet Your Virtual AI Stuntman! 💪🤖

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=eksOgX3vacs
- **Date:** 25.05.2021
- **Duration:** 7:15
- **Views:** 434,079

## Description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper "DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills" is available here:
https://xbpeng.github.io/projects/DeepMimic/index.html

❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: 
- https://www.patreon.com/TwoMinutePapers
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Haro, Alex Serban, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Haris Husic,  Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2

Thumbnail tree image credit: https://pixabay.com/images/id-576847/

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=eksOgX3vacs) Intro

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to look at a paper from three years ago, and not just any kind of paper, but my kind of paper, which sits at the intersection of machine learning, computer graphics, and physics simulations. This work zooms in on reproducing reference motions, but with a twist, and adds lots of amazing additional features.

### [0:28](https://www.youtube.com/watch?v=eksOgX3vacs&t=28s) Overview

So what does all this mean? You see, we are given this virtual character, a reference motion that we wish to teach it, and, additionally, a task that needs to be done. So, when the reference motion is specified, we place our AI into a physics simulation where it tries to reproduce these motions. That is a good thing, because if it tried to learn to run entirely on its own, it would look something like this. And if we ask it to mimic the reference motion, oh yes…much better. Now that we have built up confidence in this technique, let’s think bigger, and perform

### [1:06](https://www.youtube.com/watch?v=eksOgX3vacs&t=66s) Humanoid: Backflip

a backflip.
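The imitation idea described above can be sketched as a pose-tracking reward: the agent scores highest when its simulated pose matches the reference pose at the current frame, and the score decays as it drifts away. This is a minimal, hypothetical sketch (the function and parameter names are mine, not the paper's); the actual DeepMimic reward combines several such terms for joint poses, velocities, end-effectors, and the center of mass.

```python
import math

def imitation_reward(sim_pose, ref_pose, scale=2.0):
    """Score how closely the simulated pose matches the reference pose
    at this frame: 1.0 for a perfect match, decaying toward 0 as the
    joints drift away. (Hypothetical sketch of one pose-tracking term.)"""
    error = sum((s - r) ** 2 for s, r in zip(sim_pose, ref_pose))
    return math.exp(-scale * error)
```

With a reward like this, a standard deep reinforcement learning algorithm can train the controller to track the clip.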

### [1:11](https://www.youtube.com/watch?v=eksOgX3vacs&t=71s) No Reference State Initialization

Uh-oh. Well, that didn’t quite work. Why is that? We just established that we can give it a reference motion and it can learn it by itself. Well, this chap failed to learn a backflip because it explored many motions during training, most of which resulted in failure. So it didn’t find a good solution and settled for a mediocre one instead. A proposed technique by the name of Reference State Initialization, or RSI in short, remedies this issue by letting the agent explore better during the training phase. Got it, so we add this RSI, and now, all is well, right? Let’s see. Ouch. Not so much! It appears to fall on the ground and tries to continue the motion from there. A+ for effort, little AI, but unfortunately that’s not what we are looking for. So what is the issue here? The issue is that the agent has hit the ground, and after that, it still tries to score some additional points by continuing to mimic the reference motion. Again, A+ for effort, but this should not give the agent additional scores. The remedy is a method called early termination, which ends the episode as soon as the agent falls. Let’s try it! Now, we add early termination and RSI together, and let’s see if this will do the trick! …And…yes! Finally, with these two additions, it can now perform that sweet backflip, rolls
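The two fixes can be sketched in a few lines. This is a hypothetical illustration (the class and function names are mine, and a real training loop is far more involved): RSI starts each episode at a random phase of the reference clip, and early termination cuts the episode, and the scoring, the moment the character falls.

```python
import random

class ReferenceMotion:
    """A reference clip stored as a list of poses (stub for illustration)."""
    def __init__(self, poses):
        self.poses = poses

    def state_at(self, phase):
        # phase in [0, 1) selects a frame of the clip
        return self.poses[int(phase * len(self.poses)) % len(self.poses)]

def reset_with_rsi(ref_motion):
    """Reference State Initialization: start the episode from a random
    phase of the reference motion instead of always from frame 0, so
    the hard parts (e.g. mid-backflip) also get explored."""
    phase = random.random()
    return ref_motion.state_at(phase), phase

def early_termination(reward, character_fell):
    """Early termination: once the character hits the ground, the
    episode ends and no further imitation reward can be scored."""
    if character_fell:
        return 0.0, True   # no more points, episode over
    return reward, False
```

Without RSI, every episode begins at the start of the clip and the agent rarely reaches the mid-air states it needs to practice; without early termination, lying on the ground still pays.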

### [2:48](https://www.youtube.com/watch?v=eksOgX3vacs&t=168s) Humanoid: Cartwheel

and much, much more with flying colors. So now, the agent has the basics down, and can even perform explosive, dynamic motions

### [2:56](https://www.youtube.com/watch?v=eksOgX3vacs&t=176s) Humanoid: Sideflip

as well. So, it is time. Now hold on to your papers as now comes the coolest part - we can perform different kinds

### [3:09](https://www.youtube.com/watch?v=eksOgX3vacs&t=189s) Humanoid: Dance A

of retargeting as well. What is that? Well, one kind is retargeting the environment.

### [3:14](https://www.youtube.com/watch?v=eksOgX3vacs&t=194s) Humanoid: Getup Face-Up

This means that we can teach the AI a landing motion in an idealized case, and then, ask

### [3:18](https://www.youtube.com/watch?v=eksOgX3vacs&t=198s) Environment Retargeting

it to perform the same, but now, off of a tall ledge. Or, we can teach it to run, and then drop it into computer game level and see if it

### [3:29](https://www.youtube.com/watch?v=eksOgX3vacs&t=209s) Humanoid: Run - Mixed Obstacles

performs well. And it really does. Amazing! This part is very important because in any reasonable industry use, these characters have to perform in a variety of environments that are different from the training environment. The second kind is retargeting not the environment, but the body type.

### [3:51](https://www.youtube.com/watch?v=eksOgX3vacs&t=231s) Character Retargeting

We can have different types of characters learn the same motions. This is pretty nice for the Atlas robot, which has a drastically different weight distribution

### [4:04](https://www.youtube.com/watch?v=eksOgX3vacs&t=244s) Atlas: Backflip

and you can also see that the technique is robust against perturbations. Yes, this means one of the favorite pastimes of a computer graphics researcher, which is throwing boxes at virtual characters and seeing how well they can take it. Might as well make use of the fact that in a simulated world, we make up all the rules! This one is doing really well, … oh. Note that the Atlas robot is indeed different from the previous model, and these motions

### [4:32](https://www.youtube.com/watch?v=eksOgX3vacs&t=272s) Atlas: Spinkick

can be retargeted to it, however, this is also a humanoid. Can we ask for non-humanoids as well perhaps? Oh yes!

### [4:46](https://www.youtube.com/watch?v=eksOgX3vacs&t=286s) T-Rex: Walk

This technique supports retargeting to T-Rexes, dragons, lions, you name it.

### [4:51](https://www.youtube.com/watch?v=eksOgX3vacs&t=291s) Dragon: Walk

It can even get used to the gravity of different virtual planets that we dream up.
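As a toy illustration of why gravity retargeting matters: even a simple drop plays out very differently on the Moon than on Earth, so a controller's timing has to adapt. The snippet below is my own sketch, not something from the paper.

```python
def fall_time(height, gravity, dt=1e-4):
    """Integrate a simple free fall from `height` meters under a
    constant `gravity` (m/s^2) and return the time to hit the ground."""
    z, v, t = height, 0.0, 0.0
    while z > 0.0:
        v += gravity * dt   # accelerate downward
        z -= v * dt         # move downward
        t += dt
    return t

earth = fall_time(1.0, 9.81)  # roughly 0.45 s
moon = fall_time(1.0, 1.62)   # roughly 1.1 s
```

A policy whose timing was tuned for a half-second fall would badly mistime a one-second one, which is why the agent has to re-learn its motions under the new physics.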

### [4:55](https://www.youtube.com/watch?v=eksOgX3vacs&t=295s) Physics Retargeting

Bravo! So the value proposition of this paper is just completely out of this world.

### [5:01](https://www.youtube.com/watch?v=eksOgX3vacs&t=301s) Cartwheel Moon Gravity

Reference State Initialization, Early Termination, retargeting to different body types, environments, oh my! To have digital applications, like computer games use this would already be amazing, and

### [5:16](https://www.youtube.com/watch?v=eksOgX3vacs&t=316s) Lion: Run

just imagine what we could do if we could deploy these to real-world robots. And don’t forget, these research works just keep on improving every year. The First Law of Papers says that research is a process. Do not look at where we are, look at where we will be two more papers down the line. Now, fortunately, we can do that right now!

### [5:41](https://www.youtube.com/watch?v=eksOgX3vacs&t=341s) Humanoid: Kick

Why is that? It is because this paper is from 2018, which means that follow-up papers already exist.

### [5:45](https://www.youtube.com/watch?v=eksOgX3vacs&t=345s) Humanoid: Vault 1-Handed

What’s more, we even discussed one that teaches these agents not only to reproduce these reference motions, but to do so with style. And style there meant that the agent is allowed to make creative deviations from the reference motion, thus developing its own way of doing it. An amazing improvement. And I wonder what researchers will come up with in the near future? If you have some ideas, let me know in the comments below. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time!

---
*Source: https://ekstraktznaniy.ru/video/13904*