Reinforcement Learning with OpenAI's Gym | Two Minute Papers #72
3:30

Two Minute Papers · 12.06.2016 · 21,253 views · 429 likes

Video description

OpenAI's Gym is available here: https://gym.openai.com/

OpenAI - Non-profit AI company by Elon Musk and Sam Altman:
https://www.youtube.com/watch?v=AbcRlDBnwjM

Google DeepMind's paper "Unifying Count-Based Exploration and Intrinsic Motivation" and video on reinforcement learning and curiosity:
https://arxiv.org/pdf/1606.01868v1.pdf
https://www.youtube.com/watch?v=0yI2wJ6F8r0

Links to the mentioned research projects at Experiment:
1. https://experiment.com/projects/opening-your-mind-s-eye-collaborating-with-a-computer-to-reveal-visual-imagination?s=discover
2. https://experiment.com/projects/yvgjmnuxsnavvjuhxzwf

WE WOULD LIKE TO THANK OUR GENEROUS PATREON SUPPORTERS WHO MAKE TWO MINUTE PAPERS POSSIBLE:
David Jaenisch, Sunil Kim, Julian Josephs.
https://www.patreon.com/TwoMinutePapers

We also thank Experiment for sponsoring our series. - https://experiment.com/

Subscribe if you would like to see more of these! - http://www.youtube.com/subscription_center?add_user=keeroyz

The thumbnail image is licensed under CC0 and is available here: https://pixabay.com/en/dumbbell-training-fitness-room-940375/

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/

Table of contents (2 segments)

<Untitled Chapter 1>

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.

What is meant by reinforcement learning?

Reinforcement learning is a technique in the field of machine learning to learn how to navigate in a labyrinth, play a video game, or teach a digital creature to walk. Usually, we are interested in a series of actions that are, in some sense, optimal in a given environment. Despite the fact that many enormous tomes exist to discuss the mathematical details, the intuition behind the algorithm is incredibly simple: choose an action, and if you get rewarded for it, keep doing it. If the rewards are not coming, try something else. The reward can be, for instance, our score in a computer game or how far our digital creature could walk.

It is usually quite difficult to learn things where the reward comes long after our action, because we don't know when exactly the point was when we did something well. This is one of the reasons why Google DeepMind will try to conquer strategy games in the future: this is a genre where good plays usually include long-term planning, which reinforcement learning techniques don't really excel at. By the way, this just in: they have just published an excellent paper on including curiosity in this equation in a way that helps long-term planning remarkably.

As more techniques pop up in this direction, it is getting abundantly clear that we need a framework where they can undergo stringent testing. This means that the amount of rewards and scores should be computed the same way and in the same physical framework.

OpenAI is a non-profit company boasting an impressive roster of top-tier researchers who embarked on the quest to develop open and ethical artificial intelligence techniques. We've had a previous episode on this when the company was freshly founded, and as you might have guessed, the link is available in the description box. They have recently published their first major project, which goes by the name Gym. Gym is a unified framework that puts reinforcement learning techniques on an equal footing. Anyone can submit their solutions, which are run on the same problems, and as a nice bit of gamification, leaderboards are established to see which technique emerges victorious. These environments range from a variety of computer games to different balancing tasks. Some simpler reference solutions are also provided for many of them as a starting point. This is like Disney World for someone who is excited about the field of reinforcement learning.

With more and more techniques, this subfield gets more saturated, and it gets more and more difficult to be the first at something. That's a great challenge for researchers. From a consumer point of view, this means that better techniques will pop up day by day. And, as I like to say quite often, we have really exciting times ahead of us.

A quick shout-out to Experiment, which helps research projects come to fruition by crowdsourcing them. Current experiments include really cool projects like how we could implement better anti-doping policies for professional sports, or how to show on a computer screen how our visual imagination works.

Thanks for watching and for your generous support, and I'll see you next time!
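
Editorial aside, not part of the video: the intuition from the transcript ("choose an action, and if you get rewarded for it, keep doing it; if the rewards are not coming, try something else") can be sketched as an epsilon-greedy agent on a toy problem. Everything here is an assumption made for illustration: `TwoArmedBandit`, `run_epsilon_greedy`, the payout probabilities, and the value of `epsilon` are hypothetical names and numbers, and the environment is a stand-in that only mimics the `reset()` / `step(action)` shape of a Gym-style environment rather than using the Gym library itself.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment (NOT part of Gym).

    It only imitates the reset()/step(action) interface shape, where
    step returns (observation, reward, done, info).
    """

    def __init__(self, payout_probs=(0.2, 0.8), episode_length=1000, seed=0):
        self.payout_probs = payout_probs      # assumed chance of reward per action
        self.episode_length = episode_length  # number of steps before the episode ends
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy observation: this task has no meaningful state

    def step(self, action):
        # Reward is 1.0 with the chosen action's payout probability, else 0.0.
        reward = 1.0 if self.rng.random() < self.payout_probs[action] else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        return 0, reward, done, {}


def run_epsilon_greedy(env, n_actions=2, epsilon=0.1, seed=1):
    """Run one episode and return (value estimates, action counts, total reward)."""
    rng = random.Random(seed)
    counts = [0] * n_actions
    values = [0.0] * n_actions  # running average reward observed per action
    total_reward = 0.0

    env.reset()
    done = False
    while not done:
        if rng.random() < epsilon:
            # "Try something else": occasionally pick a random action.
            action = rng.randrange(n_actions)
        else:
            # "Keep doing it": pick the action with the best average so far.
            action = max(range(n_actions), key=lambda a: values[a])
        _, reward, done, _ = env.step(action)
        counts[action] += 1
        # Incremental update of the running average for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = run_epsilon_greedy(TwoArmedBandit())
    print("estimated value per action:", values)
    print("times each action was chosen:", counts)
    print("total reward:", total)
```

In this sketch the reward arrives immediately after each action, which is the easy case. The harder setting mentioned in the transcript, where the reward comes long after the action that earned it, is not handled by this simple running average.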
