# DeepMind’s New AI: 10 Years of Learning In Seconds!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=A2hOWShiYoM
- **Date:** 20.02.2023
- **Duration:** 10:03
- **Views:** 120,716

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 

📝 The paper "Human-Timescale Adaptation in an Open-Ended Task Space" is available here:
https://sites.google.com/view/adaptive-agent/

My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or this is the orig. Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Edward Unthank, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=A2hOWShiYoM) Introduction

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to torment DeepMind’s new AI called Ada with tasks that are impossible. Kind of. And then, see if it is smart enough to solve them.

Now, you see, we talked about previous AI techniques that were able to learn over

### [0:20](https://www.youtube.com/watch?v=A2hOWShiYoM&t=20s) Previous AI Techniques

time. For instance, NVIDIA’s little knights were able to learn to fight by themselves. But this took 10 years of training. Not 10 years in our lives, 10 years in their lives as they live inside a simulation, which, with a quick computer, will only take a few days to simulate.

In a different paper, AI agents started out like this. And over time, they learned to play football and some really advanced techniques. And there was no referee, so they also learned to be not too kind to each other. Ouch. So, how long did this take? Well, these folks also trained for years, in simulation time that is.

And now, with DeepMind’s new AI, this agent will hopefully be able to learn cool new things, and hopefully it will not take years. Yes, we are going to build a virtual playhouse,

### [1:16](https://www.youtube.com/watch?v=A2hOWShiYoM&t=76s) The New AI

and play a little game. Throughout the game, this little window to the world is what the AI sees. Only we see the whole level.

So now, little AI, you have one job. And that is, to hold the black cube. But,

### [1:33](https://www.youtube.com/watch?v=A2hOWShiYoM&t=93s) The Problem

there is a problem. What is the problem? Well, of course, the fact that there is no black cube! Not one in sight. But we have a secret. The secret is that if we touch the black pyramid with the yellow sphere, out comes a black cube. However, psst, this is a secret. The AI does not know about this rule and has to find out by itself. And you know what, let’s make it even worse. If it touches the yellow sphere to the purple pyramid, both get destroyed, making this task impossible to finish. And you know what, let’s make it even worse. Give it a strict 20-second time limit. So, good luck with that, little AI! I can’t wait to see this. This is going to be a lot of fun.

So, let’s see. Round 1. It starts exploring, likely to find the black cube, picks up the yellow sphere, dashes away, and…oh boy! Bad news. Real bad. It proceeds to touch the purple pyramid, you know what that means, right? Oh yes, both of them get destroyed. Little does the AI know, the task is now impossible. It still tries to combine some objects together, maybe something good happens, but we already know. Nothing good happens here. And then, the time runs out.

Now comes the interesting part. Did it learn anything from this? I am so excited! Let’s start again! Hmm…it says let’s not do what we did earlier. That’s a good start. It takes the black pyramid, and, there we go! Good job! We got the black cube, and now it is time…for a victory dance! Fantastic.

So, is there any point in running round 3? Oh yes. Yes there is! The goal is that we wish to see what it learned from the previous success. Did it do it just by chance? Does it really understand what just happened? Let’s see. Oh boy! It is going straight for the correct answer. It really knows what just happened. It truly uncovered the rules of this game, and now it is busy optimizing its route to solve it even quicker. I love it.
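
The hidden rules of this first level can be written down as a tiny lookup table. The following is only a minimal sketch of the rules as narrated, not the environment from the paper: the names `RULES`, `touch`, and `goal_reachable` are hypothetical, and it assumes that the two touched objects are consumed when a rule fires (the narration only states this for the purple pyramid case).

```python
# Hypothetical sketch of the level's hidden rules as described in the narration.
# Assumption: touching two objects consumes both and leaves the rule's outputs.

GOAL = "black cube"

RULES = {
    # Touching these two produces the goal object.
    frozenset({"black pyramid", "yellow sphere"}): frozenset({GOAL}),
    # Touching these two destroys both and produces nothing.
    frozenset({"yellow sphere", "purple pyramid"}): frozenset(),
}


def touch(world, a, b):
    """Return the objects left in the world after touching a to b."""
    pair = frozenset({a, b})
    if not pair <= world or pair not in RULES:
        return world  # no rule applies, nothing changes
    return (world - pair) | RULES[pair]


def goal_reachable(world, goal=GOAL):
    """Return True if some sequence of rule applications yields the goal."""
    if goal in world:
        return True
    # Every rule removes two objects and adds at most one, so this terminates.
    return any(
        goal_reachable((world - pair) | outputs, goal)
        for pair, outputs in RULES.items()
        if pair <= world
    )


start = frozenset({"black pyramid", "yellow sphere", "purple pyramid"})
after_mistake = touch(start, "yellow sphere", "purple pyramid")

print(goal_reachable(start))          # the level is solvable at the start
print(goal_reachable(after_mistake))  # after the round 1 mistake it is not
```

The agent, of course, never sees such a table. It only observes what happens when objects touch and has to infer the rules from that.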
And when running a similar task for two of these Ada agents, they learn the rules of the game independently, but that is expected. However, what is not expected is…look at that! Wow. They learned to throw to get this task done quicker, and later on, they even learned to work together to be even more efficient. So now, yes, it is time to hold on to your papers, Fellow Scholars, because

### [4:44](https://www.youtube.com/watch?v=A2hOWShiYoM&t=284s) Learning in Seconds

what you are seeing here is learning happening not in years, but in a matter of seconds. Learning is so quick here, it is happening right before our eyes. I can’t believe it.

So if we can’t believe this, what do we do? Of course, we make the task even more difficult. The previous task could only be solved by lifting things. So how about creating a level where lifting things makes you lose immediately? That sounds fun, right? This new level can only be

### [5:20](https://www.youtube.com/watch?v=A2hOWShiYoM&t=320s) New Level

solved by pushing. Of course, the AI does not know that, so in round 1, it starts out lifting with predictable results. This was not a success. So let’s see what it has learned. Round 2. Look, it starts pushing instead. And when pushing the two cubes together, got ‘em! Good job. That was super quick learning. Once again, the learning happened right before our eyes.

So, what do we do now? Of course, we make it even harder. Let’s add a bunch of unnecessary

### [5:57](https://www.youtube.com/watch?v=A2hOWShiYoM&t=357s) Extra Rules

rules and a ton of objects to distract the AI from finding the yellow pyramid. Which, by the way, doesn’t even exist. Yet. See if it can see through our little tricks. First, some exploration happens, and transmutation also happens. A lot of it. Something touches something, and some other something appears. This doesn’t even seem to make any sense. Note once again that only we see these rules and the whole map. The AI does not know anything about the rules and is playing this game for the first time, and only sees this tiny window to the world. And later, my goodness, look at that! With a stylish move, it chucks away the yellow box that is in the way. Fabulous! Then it holds the purple box, so what happens then? Look at the rules, yes, it makes the highly coveted yellow pyramid appear. Great! Then, it goes straight for the goal. Learning is happening here too…and super quickly!

And in a different cooperative level, each player has to touch their corresponding sphere to get

### [7:18](https://www.youtube.com/watch?v=A2hOWShiYoM&t=438s) Goal

their pyramids, and when these new pyramids touch, we are finished. On the first try, they explore and eventually succeed. But do they really know how they succeeded? Do they know which actions were the ones responsible for their success? Wow. On only the second try, they absolutely smashed it. That is just about the quickest and most effective solution that I can imagine. Holy mother of papers. This little AI is learning incredibly quickly.

### [7:59](https://www.youtube.com/watch?v=A2hOWShiYoM&t=479s) Why Well

And I have to be honest, when reading the paper, I was a little worried about whether it could learn at all. Why? Well, because it is not getting any intermediate rewards. What does

### [8:13](https://www.youtube.com/watch?v=A2hOWShiYoM&t=493s) What Does That Mean

that mean? It means that this game is a cruel teacher that does not tell the AI during the game how well it is doing. Only when it wins the level does it tell the AI so. Before that, no information is given to the AI about whether it is doing well or poorly. This is especially difficult

### [8:31](https://www.youtube.com/watch?v=A2hOWShiYoM&t=511s) No Information

if we need to perform a chain of actions to win the level, like you see here. And, my goodness, I am absolutely stunned that the AI can still do it at all, let alone this quickly.

So, in just one paper, we went from learning in a matter of years to a matter of seconds. Wow. This

### [8:50](https://www.youtube.com/watch?v=A2hOWShiYoM&t=530s) Conclusion

truly feels like seeing history in the making in artificial intelligence. What a time to be alive!

So, what do you think? What would you use this for? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!
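
The "no intermediate rewards" point from the transcript can be made concrete with a small sketch. This is an illustration under stated assumptions, not the reward function from the paper: the function name `episode_rewards` and the per-step list of held objects are hypothetical, and the reward is assumed to be 1.0 only on a step where the goal object is held.

```python
# Hypothetical illustration of a sparse reward signal.
# Assumption: the agent is rewarded only on steps where it holds the goal object.

def episode_rewards(held_per_step, goal="black cube"):
    """Return one reward per step: 1.0 if the goal is held, else 0.0."""
    return [1.0 if held == goal else 0.0 for held in held_per_step]


# A successful chain of actions: nothing held, then the sphere, then the cube.
success = [None, "yellow sphere", "yellow sphere", "black cube"]
# A failed round: the sphere is destroyed and the goal is never reached.
failure = [None, "yellow sphere", None, None]

print(episode_rewards(success))  # the signal only appears on the final step
print(episode_rewards(failure))  # no signal at all
```

In the successful episode every step before the last one looks, from the reward alone, exactly like a step in the failed episode. That is why a chain of required actions is hard to learn from this kind of feedback.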

---
*Source: https://ekstraktznaniy.ru/video/13280*