OpenAI - Learning Dexterous In-Hand Manipulation

Two Minute Papers · 12.02.2019 · 5:13 · 32,611 views · 1,517 likes

Video description

Check out "Superintelligence: Paths, Dangers, Strategies" on Audible:
US: https://amzn.to/2RXr32F
EU: https://amzn.to/2SqauwI

The paper "Learning Dexterous In-Hand Manipulation" is available here:
https://blog.openai.com/learning-dexterity/
https://arxiv.org/abs/1808.00177

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Jason Rollins, Javier Bustamante, John De Witt, Kaiesh Vohra, Kjartan Olason, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty.
https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Transcript

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This work is about OpenAI’s new technique that teaches a robot hand to dexterously manipulate a block into a target state. In this project, they did one of my favorite things: first training an AI within a simulation, and then deploying it in the real world. In the best-case scenario, the knowledge from the simulation will actually generalize to the real world.

However, while we are in the simulation, we can break free from the limitations of worldly things, such as hardware, movement speed, or even time itself. So how is that possible? The number of experiments we can run in a simulation is bounded not by our time, which is scarce, but by how powerful our hardware is, which is abundant, as it is accelerating at a nearly exponential pace. This is the reason why OpenAI’s and DeepMind’s AIs were able to train for 200 years’ worth of games before first playing a human pro player.

This sounds great, but a simulation is always more crude than the real world, so do we know for sure that we created something that will indeed be useful in the real world, and not just in the simulation? Let’s try an analogy. Think of the machine as a student, and the simulation as the textbook it learns from. If the textbook contains only a few trivial problems to learn from, then when the day of the exam comes, if the exam is any good, the student will fail.


The exam is the equivalent of deploying the machine into the real world, and apparently, the real world is a damn good exam. So how can we prepare a student to do well on this exam? Well, we have to provide them with a textbook that contains not only a lot of problems, but also a diverse set of challenges. This is what machine learning researchers call domain randomization. This means that we teach the AI program in different virtual worlds, and in each one of them, we change parameters like how fast the hand is, what color and weight the cube is, and more. This is a proper textbook, which means that after this kind of training, this AI can deal with new and unexpected situations. The knowledge that it has obtained is so general that we can change even the geometry of the target object and the machine will still be able to manipulate it correctly. Outstanding.

To implement this idea, scientists at OpenAI trained not one agent, but a selection of agents in these randomized environments. The first main component of this system is a pose estimator. This module looks at the cube from three angles and predicts the position and orientation of the block, and is implemented through a convolutional neural network. The advantage of this is that we can generate a near-infinite amount of training data ourselves. You can see here that when the AI looks at real images, it is only a few degrees worse than in the simulation when estimating angles, which is the sign of an excellent textbook. I would not be surprised if this accuracy exceeds the capabilities of an ordinary human, given that it can perform this many times within a second.

Then, the next part is choosing what the next action should be. Of course, we seek to rotate this cube in a way that brings us closer to our objective. This is done by a reinforcement learning technique, which uses modules similar to those in OpenAI’s previous algorithm that learned to play Dota 2 really well.
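For readers who want to see what domain randomization can look like in code, here is a minimal sketch in Python. It is not OpenAI's implementation: the names `SimParams`, `sample_params`, and `randomized_worlds`, as well as the parameter ranges, are assumptions made up for illustration. The only idea taken from the video is that every training episode gets its own "virtual world" with a different hand speed, cube weight, and cube color.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical per-episode simulator settings (illustrative names only)."""
    hand_speed_scale: float  # multiplier on how fast the hand moves
    cube_mass_kg: float      # weight of the cube
    cube_rgb: tuple          # color of the cube as seen by the cameras


# Assumed ranges, picked for illustration; these are not values from the paper.
HAND_SPEED_RANGE = (0.5, 1.5)
CUBE_MASS_RANGE = (0.03, 0.3)


def sample_params(rng: random.Random) -> SimParams:
    """Draw one randomized virtual world for the next training episode."""
    return SimParams(
        hand_speed_scale=rng.uniform(*HAND_SPEED_RANGE),
        cube_mass_kg=rng.uniform(*CUBE_MASS_RANGE),
        cube_rgb=(rng.random(), rng.random(), rng.random()),
    )


def randomized_worlds(num_episodes: int, seed: int = 0) -> list:
    """Return the settings that each of `num_episodes` episodes would use."""
    rng = random.Random(seed)
    worlds = []
    for _ in range(num_episodes):
        params = sample_params(rng)
        # A real pipeline would configure the simulator with `params`
        # and run the learning agent in it here; this sketch only records them.
        worlds.append(params)
    return worlds
```

Calling `randomized_worlds(3)` yields three different settings, one per episode. In terms of the analogy, each one is a new problem in the textbook, so the student cannot pass by memorizing a single world.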
The reuse of those modules is another testament to how general these learning algorithms are. I also recommend checking out OpenAI’s video on this work in the video description.

Now, I always read in the comments here on YouTube that many of you are longing for more. 5 minute papers, 10 minute papers, 2 hour papers were among the requests I heard from you before. And of course, I am also longing for more, as I have quite a few questions that keep me up at night. Is it possible for us to ever come up with a superintelligent AI? If yes, how? What types of these AIs could exist? Should we be worried? If you are also looking for some answers, we are now trying out a sponsorship with Audible, and I have a great recommendation for you, which is none other than the book Superintelligence by Nick Bostrom. It addresses all of these questions really well, and if you sign up under the link below in the video description, you will get this book free of charge.


Whenever you have to do some work around the house, or commute to school or work, just pop in a pair of headphones and listen for free.


Some more AI for you while doing something tedious. That’s as good as it gets. If you feel that the start of the book is a little slow for you, make sure to jump to the chapter by the name “Is the default outcome doom?”. But buckle up, because there are going to be fireworks from that point in the book. We thank Audible for supporting this video, and send a big thank you to all of you who sign up and support the series. Thanks for watching and for your generous support, and I'll see you next time!
