# Learning Dexterity

## Metadata

- **Channel:** OpenAI
- **YouTube:** https://www.youtube.com/watch?v=jwSbzNHGflM
- **Date:** 30.07.2018
- **Duration:** 3:29
- **Views:** 336,259
- **Source:** https://ekstraktznaniy.ru/video/11620

## Description

We’ve trained a human-like robot hand to manipulate physical objects with unprecedented dexterity. Our system, called Dactyl, is trained entirely in simulation and transfers its knowledge to reality, adapting to real-world physics using techniques we’ve been working on for the past year. Dactyl learns from scratch using the same general-purpose reinforcement learning algorithm and code as OpenAI Five. Our results show that it’s possible to train agents in simulation and have them solve real-world tasks, without physically accurate modeling of the world.

Learn more: https://blog.openai.com/learning-dexterity/


Directed by: Jonas Schneider
Starring: Alex Ray
Sound Supervisor: Larissa Schiavo
Production Design: Ben Barry
Production Manager: Diane Yoon
Stills Photographer: Eric Louis Haines
Sound Effects: InspectorJ https://freesound.org/people/InspectorJ/sounds/411088/ CC-Attribution
Music: "Bring Your Own" by Dexter Britain, "Intermediary" by Mattijs Muller

## Transcript

### Segment 1 (00:00 - 03:00)

We're working on teaching robots to solve a wide variety of tasks without having to program them for any one specific task. Here a robot has learned to rotate a block into any orientation we'd like. Once it succeeds at that, we give it a new goal, and so on. The system runs on a human-like robot hand, and we use reinforcement learning and simulation to teach the robot how to solve tasks in the real world.

To learn how to solve a task like this, we show it many different variations of the world, where the rules are slightly different every time. This is a technique called domain randomization, and it affects, for example, the color of the cube and of the background. However, we take this technique beyond just how the environment looks. We also randomize aspects like how fast the hand can move, how heavy the block is, and the friction between the block and the hand. Our learning algorithm sees all of these different worlds, and this lets it learn a way of manipulating the block that is very robust. Robust enough so that eventually we can accomplish the same task in the real world.

To simulate all of these possible variants of the environment, we've built a system that runs the training processes in the cloud on thousands of machines. It's called Rapid, and we've used the same system before to solve complex video games. First, rollout workers collect experience from many different variations of the environment. They send this experience data to the optimizer, which uses it to improve the parameters of the model controlling the robot. Finally, updated parameters make their way back to the rollout workers to complete the cycle.

One thing that's very interesting to us is how general the system is. Not only can it rotate blocks, but it can perform tasks with other shapes as well. If you wanted to write a controller for this task the old-fashioned way, you'd sit down and write out exactly: if I'm in this position, move this finger in this direction.
If I'm in this other position, move here, and so on. It's very meticulous. Instead, our system can learn to manipulate objects of all kinds of shapes without any additional human help. We hope that with this approach we can solve more and more complex tasks in the future, so that we can go even further beyond what today's hand-programmed robots can do.
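The domain randomization described in the transcript (randomizing how heavy the block is, how fast the hand can move, and the friction between the block and the hand) can be sketched as sampling a fresh set of physics parameters for every training episode. A minimal sketch follows; the parameter names and ranges are illustrative assumptions, not values from the actual Dactyl system:

```python
import random

# Illustrative parameter ranges; the distributions Dactyl actually used
# are not stated in the video, so these names and numbers are assumptions.
RANDOMIZATION_RANGES = {
    "block_mass_kg": (0.03, 0.3),        # how heavy the block is
    "friction_coeff": (0.5, 1.5),        # friction between block and hand
    "actuator_speed_scale": (0.7, 1.3),  # how fast the hand can move
}

def sample_environment_params(rng):
    """Draw one randomized world; each episode sees slightly different physics."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in RANDOMIZATION_RANGES.items()}

rng = random.Random(0)
episode_params = sample_environment_params(rng)
```

A policy trained across many such draws never sees the "true" physics, so it has to learn behavior that works under all of them, which is what allows the same policy to transfer to real-world physics it was never explicitly modeled on.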
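The Rapid training cycle the transcript describes (rollout workers collect experience, the optimizer uses it to improve the model's parameters, and the updated parameters flow back to the workers) can be sketched as a single-process loop. The "policy", "environment", and update rule below are deliberately toy stand-ins, not OpenAI's actual RL algorithm or infrastructure:

```python
import random

def rollout_worker(params, env_seed):
    """Collect one episode of (observation, action, reward) experience
    under the current policy parameters."""
    rng = random.Random(env_seed)
    experience = []
    for _ in range(10):
        obs = 0.5 + 0.5 * rng.random()   # toy observation in [0.5, 1.0)
        action = params["gain"] * obs    # trivial linear "policy"
        reward = -abs(action - 0.5)      # toy objective: act close to 0.5
        experience.append((obs, action, reward))
    return experience

def optimizer_step(params, batches):
    """Toy stand-in for the RL update: nudge the gain toward the value
    that would have produced the best-rewarded action on each sample."""
    samples = [obs for batch in batches for obs, _, _ in batch]
    target = sum(0.5 / obs for obs in samples) / len(samples)
    params["gain"] += 0.1 * (target - params["gain"])
    return params

# The cycle: workers -> experience -> optimizer -> updated parameters -> workers.
params = {"gain": 0.0}
for step in range(50):
    batches = [rollout_worker(params, env_seed=step * 8 + w) for w in range(8)]
    params = optimizer_step(params, batches)
```

In Rapid this loop is distributed across thousands of machines, with many rollout workers running in parallel against differently randomized environments; the sketch keeps only the data flow between the two roles.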
