📝 The paper "A Generalist Agent (DeepMind Gato)" is available here:
https://www.deepmind.com/publications/a-generalist-agent
Thumbnail image: OpenAI DALL-E 2
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Table of contents (2 segments)
Segment 1 (00:00 - 05:00)
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to have a look at DeepMind’s new AI, Gato, which can do almost everything at the same time. Playing games, controlling a robot arm, answering questions, labeling pictures, you name it. And the key is that all this is done by one AI.
Now, in the last few years, DeepMind has built AI programs that solved a number of extremely difficult problems. Their chess AI is absolutely incredible, as it plays better than any human. Then, they proceeded to tackle Go, a game with an even larger space of possible moves, with great success, as they beat the reigning world champion. But all of these are separate programs. They share some components, but all of them require a significant amount of engineering to tailor to the problem at hand. For instance, their amazing AlphaFold AI contains a lot of extra components that give it information about protein folding. Hence, it cannot be reused for other tasks as is. Their StarCraft 2 AI is also at the very least on the level of human grandmasters. This is also a separate AI.
Now, of course, DeepMind did not spend years of hard work to build an AI that could just play video games. So, why do all this? Well, they use video games as an excellent testbed for something even greater. Their goal is to write a general AI that is the one true algorithm that can do it all. In their mission statement, they often say, step number one is to solve general intelligence, and step number two, use it to solve everything else. And now, hold on to your papers, and I cannot believe that I am saying this, but here is their newest AI that takes a solid step in this direction.
So, what can it do? Well, for instance, it can label images. And these are not easy images. I like how it doesn’t just say that here are some children and slices of pizza. It says that we have a group of children who are eating pizza.
Now, the captions are not perfect; we will get back to exactly how good this AI is in a moment. Okay, so, what else? It can also chat with us. We can ask it to recommend books, ask it why a particular recommended book is interesting, ask it about black holes, protein folding, you name it. It does really well, but note that it can still be factually incorrect, even on simpler questions.
And now, with this one, we are getting there: it can also control this robot hand. It was shown how to stack the red block onto the blue one, but now, we ask it to put the blue one on the green block. This is something that it hasn’t been shown before, so, can it do it? Oh yes. Loving it. Great job, little robot!
So these were three examples. And now, we have to skip forward a little bit. Why is that? Well, we have to skip because I cannot go through everything that it can do. And now, if you have been holding on to your papers so far, now, squeeze that paper, because it can perform more than 600 tasks. 600! My goodness. How is that even possible? Well, the key here is that this is a neural network where we can dump in all kinds of data at the same time. Look. We can train it on novels, images and questions about these images, Atari game videos and controller actions, and even all this cool robot arm data. And, look! The colors show that all this data can be used to train their system. This is truly just one AI that can perform all of these tasks.
Okay, and now comes the most important question. How good is it at these tasks? And this is where I fell off the chair when I read this paper. Just look at this. What in the world! It is at least half as good as a human expert in about 450 out of about 600 tasks. And it is as good as a human expert in about a quarter of these tasks. That is mind-blowing. And note that once again, the best part of this work is that we don’t need 600 different techniques to solve these 600 tasks. We just need this one generalist AI that does it all.
What a time to be alive!
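To make the "dump in all kinds of data at the same time" idea a bit more concrete, here is a minimal sketch of how text and numeric robot data could be turned into one flat sequence of integer tokens that a single network could be trained on. This is not DeepMind's implementation. The function names, the byte-level text encoding, the 64 bins, and the value range of [-1, 1] are all assumptions chosen for illustration.

```python
from typing import List, Sequence

TEXT_VOCAB = 256         # assumption: byte-level text tokens use ids 0..255
NUM_BINS = 64            # assumption: continuous values are bucketed into 64 bins
BIN_OFFSET = TEXT_VOCAB  # bucketed values get ids above the text ids


def tokenize_text(text: str) -> List[int]:
    """Map text to byte-level token ids (a stand-in for a real subword tokenizer)."""
    return list(text.encode("utf-8"))


def tokenize_values(values: Sequence[float], low: float = -1.0, high: float = 1.0) -> List[int]:
    """Clip each value to [low, high], bucket it uniformly, and shift it past the text ids."""
    tokens = []
    for value in values:
        clipped = min(max(value, low), high)
        bin_index = min(int((clipped - low) / (high - low) * NUM_BINS), NUM_BINS - 1)
        tokens.append(BIN_OFFSET + bin_index)
    return tokens


def build_sequence(instruction: str, observation: Sequence[float], action: Sequence[float]) -> List[int]:
    """Concatenate text, observation, and action tokens into one training sequence."""
    return tokenize_text(instruction) + tokenize_values(observation) + tokenize_values(action)


if __name__ == "__main__":
    # Hypothetical example: a short instruction, two joint readings, one motor command.
    sequence = build_sequence("stack", observation=[0.0, 0.5], action=[1.0])
    print(sequence)
```

Because every modality ends up in the same integer vocabulary, one sequence model can consume all of them. A real system would also need image tokens and a much larger vocabulary, which this sketch leaves out.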
Segment 2 (05:00 - 07:00)
Now clearly, it is not perfect. Not even close. However, this is Two Minute Papers, this is the land of Fellow Scholars, so we will also apply the First Law of Papers, which says that research is a process. Do not look at where we are, look at where we will be two more papers down the line. So how do we do that? Well, we look at this chart, and, oh my. Can this really be true? This chart says that you have seen nothing yet. What does all this mean? Well, this means that as we increase the size of the neural network, we still see consistent growth in its capabilities. This means that DeepMind is just getting started with this. The next iteration will be way better.
Just one or two more papers down the line, and these inaccuracies that you have seen earlier might become a distant memory. By then, we might ask, “Remember when DeepMind’s AI answered something incorrectly?” Oh yeah, that was a completely different world back then, because it was just two papers before this one. This one really keeps me up at night. What an incredible achievement. So, does this get your mind going? And what would you use this AI for? Let me know in the comments below! Thanks for watching and for your generous support, and I’ll see you next time!