This AI Learn To Climb Crazy Terrains! 🤖
6:16

Two Minute Papers 19.02.2021 69,067 views 5,188 likes

Video description
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper "ALLSTEPS: Curriculum-driven Learning of Stepping Stone Skills" is available here:
- https://www.cs.ubc.ca/~van/papers/2020-allsteps/index.html
- https://github.com/belinghy/SteppingStone

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord. If you drop by, make sure to write a short introduction if you feel like it! https://discordapp.com/invite/hbcTJu2

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Lau, Eric Martel, Gordon Child, Haris Husic, Jace O'Brien, Javier Bustamante, Joshua Goller, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background image credit: https://pixabay.com/images/id-1696507/

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (6 segments)

<Untitled Chapter 1>

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. In 2017, scientists at OpenAI published a paper where virtual humans learned to tackle each other in a sumo competition of sorts, and found out how to rock a stable stance

Emergent behavior: stable stance

to block others from tackling them. This was a super interesting work because it involved self-play, or in other words

Quadruped agent morphology

copies of the same AI were playing against each other, and the question was, how do we pair them with each other to maximize their learning? They found something really remarkable when they asked the algorithm to defeat an older

Emergent behavior: blocking using legs

version of itself. If it can reliably pull that off, it will lead to a rapid and predictable learning process.

Emergent behavior: kicking

This kind of curriculum-driven learning can supercharge many different kinds of AIs. For instance, this robot from a later paper is essentially blind, as it only has proprioceptive sensors, which means that the only thing the robot senses is its own internal state, and that’s it. No cameras, no depth sensors, no LIDAR, nothing.

And at first, it behaves as we would expect… look, when we start out, the agent is very clumsy and can barely walk through a simple terrain… but as time passes, it grows to be a little more confident, and with that, the terrain also becomes more difficult over time in order to maximize learning. That is a great life lesson right there.

So, how potent is this kind of curriculum in teaching the AI? Well, it learned a great deal in the simulation, and as scientists deployed it into the real world, just look at how well it traversed this rocky mountain and this stream, and not even this nightmarish snowy descent gave it too much trouble.

This new technique proposes a similar curriculum-based approach where we would teach all kinds of virtual lifeforms to navigate on stepping stones. The examples include a virtual human, a bipedal robot called Cassie, and… this sphere with toothpick legs too. The authors call it “monster”, so, you know what, monster it is.

So, the fundamental question here is, how do we organize the stepping stones in this virtual environment to deliver the best teaching to this AI? We can freely choose the heights and orientations of the upcoming steps, and… of course, it is easier said than done. If the curriculum is too easy, no meaningful learning will take place, and if it gets too difficult too quickly, well… then… in the better case, this happens… and in the worst case, whoops! This work proposes an adaptive curriculum that constantly measures how these agents perform and creates challenges that progressively get harder, but in a way that the agents can still solve them.
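To make the idea of "measure how the agent performs, then make the next challenge a bit harder" more concrete, here is a minimal Python sketch. It is not the method from the ALLSTEPS paper: the class name `AdaptiveCurriculum`, the success-rate thresholds, the window size, and the height and tilt ranges are all hypothetical choices made only for illustration.

```python
import random
from collections import deque


class AdaptiveCurriculum:
    """Hypothetical success-rate-driven curriculum (illustrative only)."""

    def __init__(self, step=0.1, window=20, raise_at=0.8, lower_at=0.3):
        self.difficulty = 0.0      # 0.0 = easiest, 1.0 = hardest
        self.step = step           # how much difficulty changes at a time
        self.raise_at = raise_at   # success rate that triggers harder steps
        self.lower_at = lower_at   # success rate that triggers easier steps
        self.results = deque(maxlen=window)  # most recent episode outcomes

    def record(self, success: bool) -> float:
        """Store one episode outcome and adapt the difficulty if warranted."""
        self.results.append(success)
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate >= self.raise_at:
                # Too easy: little is being learned, so make it harder.
                self.difficulty = min(1.0, self.difficulty + self.step)
                self.results.clear()
            elif rate <= self.lower_at:
                # Too hard: the agent mostly falls, so back off.
                self.difficulty = max(0.0, self.difficulty - self.step)
                self.results.clear()
        return self.difficulty

    def sample_step_params(self, rng: random.Random) -> dict:
        """Draw the next stepping stone; ranges widen with difficulty."""
        max_height = 0.4 * self.difficulty   # assumed unit: meters
        max_tilt = 20.0 * self.difficulty    # assumed unit: degrees
        return {
            "height": rng.uniform(-max_height, max_height),
            "tilt": rng.uniform(-max_tilt, max_tilt),
        }


if __name__ == "__main__":
    curriculum = AdaptiveCurriculum(window=5)
    rng = random.Random(0)
    for _ in range(5):
        curriculum.record(True)    # five successes in a row
    print(curriculum.difficulty, curriculum.sample_step_params(rng))
```

In a training loop, one would call `sample_step_params` before each episode and `record` after it, so that the step heights and tilts track what the agent can currently handle.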
This approach can even deal with cases where the AI already knows how to climb up and down, and even deal with longer steps. But that does not mean that we are done, because if we don’t build these spirals right, this

Example failure at limit of capabilities

happens. But, after learning for 12 to 24 hours with this adaptive curriculum learning method, they become able to even run, deal with huge step height variations, high step tilt variations, and let’s see if they can pass the hardest exam… look at this mess, my goodness, lots of variation in every parameter. And… yes! It works!

And the key point is that the system is general enough that it can teach different body types to do the same. If there is one thing that you take home from this video, it shouldn’t be that it takes from 12 to 24 hours. It should be that the system is general. Normally, if we have a new body type, we need to write a new control algorithm, but in this case, whatever the body type is, we can use the same algorithm to teach it. Absolutely amazing. What a time to be alive!

However, I know what you’re thinking. Why teach them to navigate just stepping stones? This is such a narrow application of locomotion, so why this task? Great question, and the answer is that the generality of this technique we just talked about also means that the stepping stone navigation truly was just a stepping stone, and here it is: we can deploy these agents to a continuous terrain and expect them to lean on their stepping stone chops to navigate well here too. Another great triumph for curriculum-based AI training environments.

So what do you think? What would you use this technique for? Let me know in the comments, or if you wish to discuss similar topics with other Fellow Scholars in a warm and welcoming environment, make sure to join our Discord channel. The link is available in the video description. Thanks for watching and for your generous support, and I'll see you next time!
