# OpenAI's ChatGPT Now Learns 1000x Faster!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=057OY3ZyFtc
- **Date:** 18.11.2023
- **Duration:** 7:26
- **Views:** 180,466

## Description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper "Eureka: Human-Level Reward Design via Coding Large Language Models" is available here:
https://eureka-research.github.io/

📝 My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Alternatively, here is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Gaston Ingaramo, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's research works: https://cg.tuwien.ac.at/~zsolnai/
Twitter: https://twitter.com/twominutepapers

#nvidia #openai #chatgpt

## Contents

### [0:00](https://www.youtube.com/watch?v=057OY3ZyFtc) Segment 1 (00:00 - 05:00)

Here is an incredible paper that teaches virtual humans to run in the most fabulous ways. So good! So, what is this? How did we get here? These are large language models, text-based AI systems that can be our smart assistant, someone who can draw images for us — we all know that. However, fewer Fellow Scholars know that they are also excellent at reading and writing computer code, and can thus even learn to play Minecraft. That is incredible. Just think about it: this is a text-based AI. How could it possibly control a graphical game like this? Really amazing.

So, at this point, we know what these large language models can do. The age of surprises is over… or so I thought! Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

Previous AI-based techniques can play Atari games, and other games, at a superhuman level. They typically do it through reinforcement learning, which means that they get a controller, look at the screen, and play like a human would. DeepMind's techniques are excellent at this, and thus they already do extremely well on these games.

However, there is a huge problem. What is the problem? Well, look. Scores. Games have scores. And these scores can be used as feedback for the AI to understand whether it is doing well or not. High score, doing well; lower score, not so much. And whatever other task we wish it to perform, there needs to be a score. There needs to be feedback. And therein lies the problem. Each game has a different scoring mechanism, and thus each task needs a different scoring mechanism. If we are aiming to create an intelligence of sorts, any kind of intelligence requires generality. So it has to be able to perform tasks it hasn't seen before. But how?

Scientists at NVIDIA had a crazy idea. Let's use these ChatGPT-ish large language models to write the code that calculates the score for these tasks by itself. Wow. I love it. But does it work?
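The idea described above — asking a language model to write the scoring code itself — might produce something like the following minimal sketch. This is not the paper's actual generated reward: the state names, weights, and terms are invented purely for illustration.

```python
import math

def running_reward(forward_velocity, torso_height, energy_used):
    """Hypothetical LLM-written reward for a humanoid running task.

    Rewards forward progress, keeps the torso near a standing height,
    and penalizes wasted actuation. All terms and weights here are
    illustrative assumptions, not taken from the Eureka paper.
    """
    progress = 1.0 * forward_velocity                      # move forward
    upright = 0.5 * math.exp(-(torso_height - 1.3) ** 2)   # stay upright
    effort = 0.05 * energy_used                            # don't flail
    return progress + upright - effort
```

During training, the reinforcement learner would call a function like this on every simulation step and adjust its policy to maximize the returned score.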
I think that cannot possibly work; there are just too many kinds of tasks out there. Well, let's have a look together.

Now, little AI, write code for the score for this humanoid to run, and then train it. Well, what can I say… it works. Well, look, no one said it has to be beautiful. We run. Running is making progress moving forward. It recognized that, this already works, and the whole thing was written by an AI.

Now, get this. Let's try this some other way. First, little AI, design a task where forward movement is necessary. Wow. That is a fabulous way of moving forward. Now let's give it some feedback: this looks like squat jumps, make the movement resemble running a little more. So the AI says, okay, got it, so you mean a duck walk! Well, not quite! New feedback: the torso has to be a little higher. Okay, good. But it is using mostly one leg to hop on. Please use both legs. Goodness! Both legs are now being used, but look. Hoo boy! Back to square one. Now the torso is too low again, so let's ask it to penalize that, and… there we go. Finally, something that resembles running. It is not an easy task at all, it turns out. And this human feedback-driven solution, I think, is way better than the AI on its own. This intelligence of sorts is wonderful, but human intelligence is where it's at.

What a fantastic paper! But it gets better. In fact, here comes the best part! This concept generalizes to so many tasks, from passing balls to balancing them, teaching robots to move, and even the crowd favorite: spinning pens. Very impressive. But how does this train so well and so quickly? The key is that this learns within a computer simulation, which has two advantages: one, it can't really harm itself, and two, my favorite: in the real world, one second means one second,
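The feedback rounds narrated above can be pictured as successive rewrites of the reward code. What follows is a hypothetical sketch with invented names and weights — in the actual system, a language model performs rewrites like these, taking the human's plain-text feedback as part of its prompt.

```python
def reward_round1(forward_velocity, torso_height, legs_in_use):
    # Round 1: only forward progress is rewarded -> squat-jump behavior.
    return forward_velocity

def reward_round2(forward_velocity, torso_height, legs_in_use):
    # Feedback "the torso has to be a little higher": reward torso height.
    return forward_velocity + 0.5 * torso_height

def reward_round3(forward_velocity, torso_height, legs_in_use):
    # Feedback "please use both legs" and "the torso is too low again":
    # bonus for a two-legged gait, penalty for a sunken torso.
    both_legs = 0.3 if legs_in_use == 2 else 0.0
    low_torso = 0.4 if torso_height < 1.0 else 0.0
    return forward_velocity + 0.5 * torso_height + both_legs - low_torso
```

Each revision reshapes what the learner is optimizing for, which is why the gait changes so drastically between rounds.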

### [5:00](https://www.youtube.com/watch?v=057OY3ZyFtc&t=300s) Segment 2 (05:00 - 07:00)

but in a virtual world, one real second can be simulated much quicker, given a powerful computer. How much quicker? Well, in this case, this little AI is learning a thousand times quicker in a simulation than it would in the real world. So cool!

And it gets even better. Hold on to your papers, Fellow Scholars, because it can not only match the level of humans playing these games — the evolution-based variant can even showcase superhuman performance. Wow! It matches or exceeds the level of humans on 75% of the dexterity-based tasks. That is insanity.

Finally, an AI that can do these tasks and have some sort of generality. What we are seeing here is perhaps the early days of true intelligence being born. And all this within a little piece of silicon, and lots of human ingenuity. What a time to be alive!

However, it is a research work, and it is not perfect. Not even close. For instance, please don't ask it to close your doors. Ouch.
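The arithmetic behind a speedup like this is simple: a simulator that steps faster than real time, run across many parallel environments, multiplies the experience gathered per wall-clock hour. The numbers below are illustrative assumptions, not figures from the paper.

```python
def experience_hours(wall_clock_hours, realtime_factor, parallel_envs):
    """Simulated experience accumulated in a given wall-clock time.

    realtime_factor: simulated seconds per real second in one
                     environment (an illustrative assumption).
    parallel_envs:   environments stepped simultaneously.
    """
    return wall_clock_hours * realtime_factor * parallel_envs

# e.g. stepping 10x real time across 100 parallel environments yields
# a thousand hours of experience per real hour -- a 1000x speedup:
print(experience_hours(1.0, 10, 100))  # -> 1000.0
```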

---
*Source: https://ekstraktznaniy.ru/video/12918*