AlphaZero: DeepMind's New Chess AI | Two Minute Papers #216

6:35

Two Minute Papers · 21.12.2017 · 87,767 views · 2,721 likes

Video description

The paper "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" is available here: https://arxiv.org/pdf/1712.01815.pdf

Our Patreon page with the details: https://www.patreon.com/TwoMinutePapers

One-time payments:
PayPal: https://www.paypal.me/TwoMinutePapers
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7WkrFAHh
Ethereum: 0x002BB163DfE89B7aD0712846F1a1E53ba6136b5A

Recommendations:
https://www.youtube.com/watch?v=akgalUq5vew
https://www.youtube.com/watch?v=0g9SlVdv1PY
https://www.youtube.com/watch?v=Ud8F-cNsa-k
https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match
http://forum.computerschach.de/cgi-bin/mwf/topic_show.pl?tid=9653

We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Andrew Melnychuk, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dave Rushton-Smith, Dennis Abts, Emmanuel, Eric Haddad, Esa Turkulainen, Evan Breznyik, Frank Goertzen, Kaben Gabriel Nanlohy, Malek Cellier, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Michael Orenstein, Raul Araújo da Silva, Robin Graham, Steef, Steve Messina, Sunil Kim, Torsten Reil.
https://www.patreon.com/TwoMinutePapers

Credits:
Elo ratings: https://ratings.fide.com/top.phtml?list=men
Magnus image source: https://www.youtube.com/watch?v=eLaOeXCAPbU
400 point difference rule: https://www.fide.com/fide/handbook.html?id=172&view=article (ctrl+f 400)
One chess match source: https://chess24.com/en/watch/live-tournaments/alphazero-vs-stockfish/1/1/1
Stockfish: https://stockfishchess.org/
Thumbnail background image credit: https://pixabay.com/photo-1483735/
Music: Antarctica by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (3 segments)

<Untitled Chapter 1>

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. After defeating pretty much every highly ranked professional player in the game of Go, Google DeepMind has now ventured into the realm of chess. They recently challenged not the best humans, no-no-no, that was long ago. They challenged Stockfish, the best computer chess engine in existence, in quite possibly the most exciting chess-related event since Kasparov's matches against Deep Blue. I will note that I was told by DeepMind that this is the preliminary version of the paper, so now we shall have an initial look, and perhaps make a part 2 video with the newer

What is AlphaZero?

results when the final paper drops. AlphaZero is based on a neural network and reinforcement learning, and is trained entirely through self-play after being given the rules of the game. It is not to be confused with AlphaGo Zero, which played Go. It is also noted that this is not simply AlphaGo Zero applied to chess. This is a new variant of the algorithm. The differences include:

- One, the rules of chess are asymmetric: for instance, pawns only move forward, and castling is different on kingside and queenside. This means that neural network-based techniques are less effective at it.
- Two, the algorithm not only has to predict a binary win or loss probability when given a move; draws are also a possibility, and that has to be taken into consideration. Sometimes a draw is the best we can do, actually.

There are many more changes to the previous incarnation of the algorithm, please make sure to have a look at the paper for details.

Before we start with the results and more details, a word on Elo ratings for perspective. The Elo rating is a number that measures the relative skill level of a player. Currently, the human player with the highest Elo rating, Magnus Carlsen, is hovering around 2800. This man played chess blindfolded against 10 opponents simultaneously in Vienna a couple of years ago and won most of these games. That's how good he is. And Stockfish is one of the best current chess engines, with an Elo rating over 3300. A difference of 500 Elo points means that if it were to play against Magnus Carlsen, it would be expected to win at least 95 games out of 100. Though it is noted that there is a rule suggesting a hard cutoff at around a 400-point difference.

The two algorithms then played each other: AlphaZero versus Stockfish. They were both given 60 seconds of thinking time per move, which is considered to be plenty
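As a rough check on the Elo comparison in this segment, here is a minimal sketch of the standard logistic Elo expected-score formula. The function name and the round ratings (3300 and 2800) are assumptions taken from the figures quoted in the transcript; FIDE's official conversion tables and the 400-point rule give somewhat different numbers, so treat this as an approximation.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model.

    The score counts a win as 1, a draw as 0.5 and a loss as 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Approximate ratings quoted in the transcript (assumed round numbers).
STOCKFISH_ELO = 3300
CARLSEN_ELO = 2800

expected = elo_expected_score(STOCKFISH_ELO, CARLSEN_ELO)
print(f"Expected score per game: {expected:.3f}")               # 0.947
print(f"Expected points over 100 games: {100 * expected:.1f}")  # 94.7
```

Note that the expected score mixes wins and draws, so about 95 points out of 100 does not necessarily mean 95 outright wins.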

Is AlphaZero better than Stockfish?

given that both of the algorithms take around 10 seconds at most per move. And here are the results. AlphaZero was able to outperform Stockfish in about 4 hours of learning from scratch. They played 100 games - AlphaZero won 28 times, drew 72 times and never lost to Stockfish. Holy mother of papers, do you hear that? Stockfish is already unfathomably powerful compared to even the best human prodigies, and AlphaZero basically crushed it after four hours of self-play. And it was run on hardware similar to AlphaGo Zero's: one machine with 4 Tensor Processing Units. This is hardly commodity hardware, but given the trajectory of the improvements we've seen lately, it might very well be in a couple of years.

Note that Stockfish does not use machine learning and is a handcrafted algorithm. People like to refer to computer opponents in computer games as AI, but it is not doing any sort of learning.

So, you know what the best part is? AlphaZero is a much more general algorithm that can also play Shogi, which is also referred to as Japanese chess, at an extremely high level. And this is one of the most interesting points - AlphaZero would be highly useful even if it were slightly weaker than Stockfish, because it is built on more general learning algorithms that can be reused for other tasks without investing significant human effort. But in fact, it is more general, and it also crushes Stockfish. With every paper from DeepMind, the algorithm becomes better AND more general. I can tell you, this is very, very rarely the case. Total insanity.

Two more interesting tidbits about the paper: one, all the domain knowledge the algorithm is given is stated precisely for clarity. Two, one might think that as processing power increases over time, all we have to do is add more brute force to the algorithm and just evaluate more positions.
If you think this is the case, have a look at this - it is noted that AlphaZero was able to reliably defeat Stockfish WHILE evaluating roughly a thousand times fewer positions per second (the paper reports about 80 thousand versus 70 million). Maybe we could call this the AI equivalent of intuition, in other words, being able to identify a small number of promising moves and focusing on them. Chills run down my spine as I read this paper. Being a researcher is the best job in the world. And we are even being paid for this. Unreal.

This is a hot paper, there are lots of discussions out there on this, and lots of chess experts are analyzing the games and trying to make sense of them. I had a ton of fun reading and watching through some of these. As always, Two Minute Papers encourages you to explore and read more, and the video description is ample in useful materials. You will find videos with some really cool analysis from Grandmaster Daniel King, International Chess Master Daniel Rensch, and the YouTube channel ChessNetwork. All quality materials.

And, if you have enjoyed this episode and you think that 8 of these videos a month is worth a few dollars, please throw a coin our way on Patreon, or, if you favor cryptocurrencies instead, you can throw Bitcoin or Ethereum our way. Your support has been amazing as always, and thanks so much for keeping with us through thick and thin, even in times when weird Patreon decisions happen. Luckily, this last one has been reverted. I am honored to have supporters like you, Fellow Scholars. Thanks for watching and for your generous support, and I'll see you next time!
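To put the match result quoted in the transcript (28 wins, 72 draws, no losses over 100 games) into Elo terms, the logistic Elo formula can be inverted to give the rating gap implied by a score fraction. This is only a sketch: the function names are hypothetical, the logistic model is an assumption, and a single 100-game match carries considerable statistical uncertainty that this calculation ignores.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Fraction of available points scored, counting a draw as half a point."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("at least one game is required")
    return (wins + 0.5 * draws) / games


def implied_elo_difference(score: float) -> float:
    """Rating gap that would produce `score` as the expected score (logistic Elo)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Result from the transcript, from AlphaZero's point of view.
score = match_score(wins=28, draws=72, losses=0)
print(f"Score fraction: {score:.2f}")                                  # 0.64
print(f"Implied Elo difference: {implied_elo_difference(score):.0f}")  # 100
```

Under these assumptions, a 64% score corresponds to a gap of about 100 Elo points in AlphaZero's favor.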
