These Are The 7 Capabilities Every AI Should Have


Two Minute Papers · 28.09.2019 · 50,312 views · 2,512 likes

Video description

❤️ Thank you so much for your support on Patreon: https://www.patreon.com/TwoMinutePapers

📝 The paper "Behaviour Suite for Reinforcement Learning" is available here:
https://arxiv.org/abs/1908.03568
https://github.com/deepmind/bsuite

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Matthias Jost, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil.
https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Contents (1 segment)

Segment 1 (00:00 - 04:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. A few years ago, scientists at DeepMind published a learning algorithm that they called deep reinforcement learning, which quickly took the world by storm. This technique is a combination of a neural network that processes the visual data that we see on the screen and a reinforcement learner that comes up with the gameplay-related decisions, which proved to be able to reach superhuman performance on computer games like Atari Breakout. This paper not only sparked quite a bit of mainstream media interest, but also provided fertile grounds for new follow-up research works to emerge.

For instance, one of these follow-up papers infused these agents with a very human-like quality, curiosity, further improving many aspects of the original learning method. However, it had a disadvantage: I kid you not, it got addicted to the TV and kept staring at it forever. This was perhaps a little too human-like. In any case, you may rest assured that this shortcoming has been remedied since.

And every follow-up paper recorded their scores on a set of Atari games. Measuring and comparing is an important part of research and is absolutely necessary so we can compare new learning methods more objectively. It's like recording your time for the Olympics at the 100-meter dash. In that case, it is quite easy to decide which athlete is the best. However, this is not so easy in research.

In this paper, scientists at DeepMind note that just recording the scores doesn't give us enough information anymore. There's so much more to reinforcement learning algorithms than just scores. So they built a behavior suite that also evaluates the seven core capabilities of reinforcement learning algorithms. Among these seven core capabilities, they list generalization, which tells us how well the agent is expected to do in previously unseen environments, and how good it is at credit assignment, which is a prominent problem in reinforcement learning. Credit assignment is very tricky to solve because, for instance, when we play a strategy game, we need to make a long sequence of strategic decisions, and in the end, if we lose an hour later, we have to figure out which one of these many decisions led to our loss. Measuring this as one of the core capabilities was, in my opinion, a great design decision here. How well the algorithm scales to larger problems also gets a spot as one of these core capabilities.

I hope this testing suite will see widespread adoption in reinforcement learning research, and what I am really looking forward to is seeing these radar plots for newer algorithms, which will quickly reveal whether we have a new method that takes a different trade-off than previous methods, or in other words, has the same area within the polygon but with a different shape. Or, in the case of a real breakthrough, the area of these polygons will start to increase. Luckily, a few of these charts are already available in the paper, and they give us so much information about these methods. I could stare at them all day long, and I cannot wait to see some newer methods appear here.

Now, note that there is a lot more to this paper. If you have a look at it in the video description, you will also find the experiments that are part of this suite, what makes a good environment to test these agents in, and that they plan to form a committee of prominent researchers to periodically review it. I love that part.

If you enjoyed this video, please consider supporting us on Patreon. If you do, we can offer you early access to these videos so you can watch them before anyone else, or you can also get your name immortalized in the video description. Just click the link in the description if you wish to chip in. Thanks for watching and for your generous support, and I'll see you next time!
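Editor's sketch: the radar-plot remark in the transcript (same polygon area with a different shape versus a growing area) can be made concrete with a little arithmetic. The Python snippet below is a minimal illustration, not code from the paper or from the bsuite library. It assumes seven equally spaced axes and scores in [0, 1]; the axis names follow the categories used in the bsuite paper, while the agents and all of their scores are hypothetical values invented for this example.

```python
import math

# Axis names follow the bsuite paper's categories (assumption of this sketch);
# the transcript itself only names generalization, credit assignment and scale.
CAPABILITIES = [
    "basic", "noise", "scale", "exploration",
    "credit_assignment", "generalization", "memory",
]


def radar_area(scores):
    """Area of the polygon a radar plot draws for scores on equally spaced axes.

    Neighbouring axes are separated by an angle of 2*pi/n, so each pair of
    adjacent scores spans a triangle of area 0.5 * r_i * r_j * sin(2*pi/n).
    """
    n = len(scores)
    if n < 3:
        raise ValueError("a radar polygon needs at least three axes")
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(scores[i] * scores[(i + 1) % n] for i in range(n))


# Hypothetical scores, invented purely for illustration.
agent_a = [0.9, 0.6, 0.8, 0.2, 0.4, 0.5, 0.3]
# Same values shifted by one axis: a different trade-off, identical area.
agent_b = agent_a[1:] + agent_a[:1]
# Better on every axis: the polygon grows.
agent_c = [s + 0.1 for s in agent_a]

for name, scores in [("A", agent_a), ("B", agent_b), ("C", agent_c)]:
    profile = dict(zip(CAPABILITIES, scores))
    print(name, round(radar_area(scores), 4), profile)
```

One caveat on the design: the polygon area depends on the order in which the axes are drawn, so it is best read as a visual cue for comparing methods on the same chart rather than as a standalone metric.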
