Deep Reinforcement Terrain Learning | Two Minute Papers #67

Two Minute Papers · 19.05.2016 · 25,699 views · 719 likes


Video description
In this piece of work, a combination of deep learning and reinforcement learning is presented which has proven to be useful in solving many extremely difficult tasks. Google DeepMind built a system that can play Atari games at a superhuman level using this technique, which is also referred to as Deep Q-Learning. This time, it was used to teach digital creatures to walk and overcome challenging terrain arrangements.

The paper "Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning" is available here: http://www.cs.ubc.ca/~van/papers/2016-TOG-deepRL/index.html
The implementation of the paper is also available here: https://github.com/xbpeng/DeepTerrainRL
OpenAI's Gym project: https://gym.openai.com/

WE WOULD LIKE TO THANK OUR GENEROUS SUPPORTERS WHO MAKE TWO MINUTE PAPERS POSSIBLE: Sunil Kim. https://www.patreon.com/TwoMinutePapers
Subscribe if you would like to see more of these! - http://www.youtube.com/subscription_center?add_user=keeroyz
The thumbnail background image was created by Fulvio Spada - https://flic.kr/p/o7z8o1
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/

Table of contents (1 segment)

Segment 1 (00:00 - 03:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This is a follow-up work to a technique we have talked about earlier. We have seen how different creatures learned to walk, and their movement patterns happened to be robust to slight variations in the terrain. In this work, we imagine these creatures as a collection of joints and links, typically around 20 links. Depending on what actions we choose for these individual body parts over time, we can construct movements such as walking or leaping forward.

However, this time, these creatures not only learn to walk; they also monitor their surroundings and are taught to cope with the immense difficulties that arise from larger terrain differences. This means that they learn on both character features, such as where the center of mass is and what the velocities of the different body parts are, and terrain features, such as the slope of the terrain we're walking up, or whether there's a wall ahead of us.

The machinery used to achieve this is deep reinforcement learning. It is therefore a combination of a deep neural network and a reinforcement learning algorithm. The neural network learns the correspondence between these states and output actions, and the reinforcement learner tries to guess which action will lead to a positive reward, which is typically measured as our progress through the level.

In this footage we can witness how a simple learning algorithm built from these two puzzle pieces can teach these creatures to modify their center of mass and adapt their movement to overcome more sophisticated obstacles and other kinds of adversities. And please note that the technique still supports a variety of different creature setups. One important limitation of this technique is that it is restricted to 2D. This means that the characters walk around not in a 3D world, but on a plane.
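The loop described above — an agent observes a state, picks an action, and is rewarded for forward progress through the level — can be illustrated with a tiny tabular Q-learning stand-in. This is only a sketch of the underlying idea, not the paper's method: the paper uses a deep network over rich character and terrain features, whereas everything below (the 1D terrain, the "step"/"leap" actions, the rough cells) is a hypothetical toy.

```python
import random

# Toy 1D "terrain": the agent starts at position 0 and must reach position 5.
# Action 0 = careful step (+1). Action 1 = leap (+2), but a leap stumbles
# if the next cell is rough. Reward is forward progress, echoing the video's
# "how far we got through the level" reward.
ROUGH = {2, 4}           # hypothetical rough-terrain cells
GOAL = 5
ACTIONS = (0, 1)

def step(pos, action):
    """Environment transition: returns (new_position, reward)."""
    if action == 1 and pos + 1 in ROUGH:   # leap stumbles on rough terrain
        return pos, -1.0                   # no progress, small penalty
    new = min(pos + (1 if action == 0 else 2), GOAL)
    return new, float(new - pos)           # reward = progress made

def train(episodes=2000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        pos = 0
        for _ in range(100):               # step cap per episode
            if pos == GOAL:
                break
            # Epsilon-greedy: mostly exploit the current value estimates.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(pos, x)])
            new, r = step(pos, a)
            # Q-learning update toward reward plus discounted best next value.
            best_next = max(q[(new, x)] for x in ACTIONS)
            q[(pos, a)] += alpha * (r + gamma * best_next - q[(pos, a)])
            pos = new
    return q

q = train()
# Greedy policy per position: leap on smooth ground, step before rough cells.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

In the paper's setting, the lookup table `q` is replaced by a deep neural network so the same value-learning idea scales to continuous character and terrain features.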
Whether we're shackled by the 2D nature of the technique, or whether the results can be carried over to 3D, remains to be seen. I'd like to note that candidly discussing limitations is immensely important in research, and the most important thing is often not what we can do at this moment, but the long-term potential of the technique, which, I think, this work has in abundance. It's very clear that in this research area, enormous leaps are made year by year, and there's lots to be excited about. As more papers are published on this locomotion problem, the authors also note that it would be great to have a unified physics system and some error metrics so that we can measure these techniques against each other on an equal footing. I feel that such a work would provide fertile ground for more exploration in this area, and if I see more papers akin to this one, I'll be a happy man. Thanks for watching, and for your generous support, and I'll see you next time!
