Terrain Traversal with Reinforcement Learning | Two Minute Papers #26

2:38

Two Minute Papers · 18.11.2015 · 10,493 views · 238 likes

Video description

Reinforcement learning is a technique that can learn how to play computer games, or any kind of activity that requires a sequence of actions. In this case, we would like a digital dog to run, and leap over and onto obstacles by choosing the optimal next action. It is quite difficult as there are a lot of body parts to control in harmony. And what is really amazing is that if it has learned everything properly, it will come up with exactly the same movements as we'd expect animals to do in real life! In this technique, dogs were used to demonstrate that reinforcement learning works well in this context, but it's worth noting that it also works with bipeds.

The paper "Dynamic Terrain Traversal Skills Using Reinforcement Learning" is available here: http://www.cs.ubc.ca/~van/papers/2015-TOG-terrainRL/

Recommended for you: Digital Creatures Learn To Walk - https://www.youtube.com/watch?v=kQ2bqz3HPJE

Subscribe if you would like to see more of these! - http://www.youtube.com/subscription_center?add_user=keeroyz

Thumbnail image by localpups (CC BY 2.0). It was slightly edited (flipped, color adjustments, content aware filling) - https://flic.kr/p/wXfFt1

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Patreon → https://www.patreon.com/TwoMinutePapers
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/

Contents (1 segment)

Segment 1 (00:00 - 02:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. Reinforcement learning is a technique that can learn how to play computer games, or any kind of activity that requires a sequence of actions. We are not interested in figuring out what we see on an image, because there the answer is one thing; we are always interested in a sequence of actions. The input for reinforcement learning is a state that describes where we are and how the world looks around us, and the algorithm outputs the optimal next action to take.

In this case, we would like a digital dog to run, and leap over and onto obstacles by choosing the optimal next action. It is quite difficult, as there are a lot of body parts to control in harmony. The algorithm has to be able to decide how to control leg forces, spine curvature, and angles for the shoulder, elbow, hip and knees. And what is really amazing is that if it has learned everything properly, it will come up with exactly the same movements as we'd expect animals to do in real life.

So this is how reinforcement learning works: if you do well, you get a reward, and if you don't, you get some kind of punishment. These rewards and punishments are usually encoded in a score. If your score is increasing, you know you've done something right, and you try to self-reflect and analyze the last few actions to find out which of them were responsible for this positive change. The score would be, for instance, how far the dog could run on the map without falling, and at the same time it also makes sense to minimize the amount of effort to make it happen.

So, reinforcement learning in a nutshell: it is very similar to how a real-world animal or even a human would learn. If you're not doing well, try something new, and if you're succeeding, remember what you did that led to your success and keep doing that. In this technique, dogs were used to demonstrate the concept, but it's worth noting that it also works with bipeds.

Reinforcement learning is typically used in many control situations that are extremely difficult to solve otherwise, like controlling a quadrocopter properly. It's quite delightful to see such a cool work, especially given that there are not so many uses of reinforcement learning in computer graphics yet. I wonder why that is. Is it that not so many graphical tasks require a sequence of actions? Or maybe we just need to shift our mindset and get used to the idea of formalizing problems in a different way, so we can use such powerful techniques to solve them. It is definitely worth the effort.

Thanks for watching and for your generous support, and I'll see you next time!
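The loop described in the transcript (observe a state, pick an action, receive a score, adjust) can be sketched in a few lines. The example below is only a toy illustration and not the method of the paper: it assumes a made-up one-dimensional terrain with two gaps, two actions (`STEP` and `JUMP`), and plain tabular Q-learning. The terrain layout, the effort costs, the fall penalty, and all function names are assumptions chosen for this sketch. The score follows the idea from the video: distance gained minus effort, with a punishment for falling.

```python
import random

# Toy 1-D terrain: True marks a gap the runner must not land on.
# The layout is invented for illustration; it is not taken from the paper.
TERRAIN = [False, False, True, False, False, False, True, False, False, False]
GOAL = len(TERRAIN)               # reaching or passing this index ends the run

STEP, JUMP = 0, 1                 # the two actions available in every state
ACTIONS = (STEP, JUMP)
ADVANCE = {STEP: 1, JUMP: 2}      # cells covered by each action
EFFORT = {STEP: 0.1, JUMP: 0.5}   # assumed effort cost of each action
FALL_PENALTY = 5.0                # assumed punishment for landing in a gap


def transition(pos, action):
    """Apply `action` at `pos`; return (next_pos, reward, done)."""
    nxt = min(pos + ADVANCE[action], GOAL)
    if nxt < GOAL and TERRAIN[nxt]:
        return nxt, -FALL_PENALTY, True          # fell into a gap
    reward = (nxt - pos) - EFFORT[action]        # distance gained minus effort
    return nxt, reward, nxt == GOAL


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(GOAL + 1)]    # q[state][action]
    for _ in range(episodes):
        pos, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)     # try something new
            else:
                # repeat what has worked best so far
                action = max(ACTIONS, key=lambda a: q[pos][a])
            nxt, reward, done = transition(pos, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[pos][action] += alpha * (target - q[pos][action])
            pos = nxt
    return q


def greedy_rollout(q):
    """Follow the learned policy without exploration; return (path, score)."""
    pos, done, path, score = 0, False, [0], 0.0
    while not done:
        action = max(ACTIONS, key=lambda a: q[pos][a])
        pos, reward, done = transition(pos, action)
        path.append(pos)
        score += reward
    return path, score


if __name__ == "__main__":
    path, score = greedy_rollout(train())
    print("visited cells:", path)
    print("score:", round(score, 2))
```

The state here is just the runner's position, which is far simpler than the joint angles and forces mentioned in the video; a table of values is enough only because the toy problem has eleven states. The discount `gamma=1.0` is acceptable in this sketch because every action moves forward, so each episode is guaranteed to end.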
