❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers
📝 The paper "Competitive Programming with Large Reasoning Models" is available here:
https://arxiv.org/abs/2502.06807
📝 My paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu
Table of contents (2 segments)
Segment 1 (00:00 - 05:00)
I must admit I am a bit nervous. I am nervous because this is our 941st video, and it might be one of the most important ones I’ve ever done. I’ll try my best. In their latest paper, scientists at OpenAI won a gold medal, sort of, and in the process, they found something absolutely incredible about the nature of artificial intelligence. I have been waiting for this for a long time and barely anyone is talking about the paper. Crazy. I promise to get to the paper in a moment, but to understand it, we have a little gaming to do. You see, earlier, when we wanted to write a computer program to play a game, we had to give it instructions by hand. Go here, turn around, jump up, and so on. Then, as the first AI techniques appeared, many of them were able to learn a bit, but they were taught the rules of the game and some of the classic strategies. Think about giving a few books of opening strategies to a chess AI to learn from. That sounds like a good idea, we humans have years and years of knowledge about the game. Of course. Why not help the AI? Well, that might be a big mistake. Why? Because if you teach it good strategies, it might never find the best strategies. Wanna see an example? This is the “You Shall Not Pass” game, where the red agent is trying to hold back the blue character and not let it cross the line. Here you see two regular AIs duking it out, sometimes the red wins, sometimes the blue is able to get through. Nothing too crazy here. This is the reference case, which is somewhat well balanced. Now, look closely, because here comes the hacker adversarial agent. Ha! Yes, you are seeing correctly, this chap is doing nothing. Absolutely nothing. But it is doing nothing in a way that reprograms its opponent to make mistakes and behave almost like an agent acting completely at random! This paper was absolute insanity. And that is the key.
If you teach the AI strategies on how to play, it might find some good strategies, but it won’t ever have the chance to find the best ones. And these are things that we would never have found ourselves. So, here is a strategy: teach the AI less, and let it learn on its own more. Okay, but I hear you super smart Fellow Scholars asking: okay Károly, this works on this little toy game, does this concept work on a greater variety of tasks? Glad you asked. Let’s see together. Have a look at this. So far we have talked about mastering one specific game. You train an AI that is able to do that. However, if you wish to play a different game, you need a different AI. One AI, one game. Now here is the best part: you can also train one AI to be able to play many games, and if you do that, something so surprising happens that it is one of the most stunning insights of my entire career. What is that? Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. It is the following experiment. You take a specialist AI, and let it master one game. Then you take a generalist AI that kinda knows this game, but knows other games too. Who wins? Of course, I say the specialist does because it played the game so much more. Imagine the Olympics: you have a super muscular wrestler who did wrestling his whole life, and you get a scrawny guy who kinda dabbles in wrestling, swimming, and 20 other sports. Now these two wrestle. Who wins? Of course, the specialist, the muscle guy wins. Except that he doesn’t. Look. The guy who dabbles in many sports, that is, the AI that played many games, but one particular game only a bit, can beat a specialist that played this game a ton. I mean, what? That is insanity! And now, I can finally tell you why this fresh new research is a stunner: OpenAI applied this concept to not just games, but to programming, and quite possibly, everything. So what did they do? They started using their AI to solve challenging programming tasks.
So, what is the result? Now hold on to your papers Fellow Scholars, I shall present to you the Holy Chart. Their o1 system did really well. So far so good. Then, their specialist system did even better. This is a specialist. This is the muscular wrestler. It was shown handcrafted, human-taught data and strategies to excel at one thing. Is it good? It is not good, it is amazing.
Segment 2 (05:00 - 07:00)
Under somewhat relaxed conditions, this is good enough to win a gold medal. Whoa. That is stunning, and normally we would stop here, but it gets even better. They introduced a smarter, generalist agent, o3, that has no specialized knowledge and learned on its own, so in return, it is probably worse and…what? Are you seeing what I am seeing? Once again, in a new area, the generalist beats the specialist. The scrawny guy beats the wrestler. Wow. But why? Because this AI learns something in one task, and is able to apply it to another. But wait, to me, that sounds exactly like intelligence. To me, this sounds like artificial intelligence is finally a possibility. Think of all the good this will be able to do, from designing new drugs to defeat previously untreatable diseases to giving a personalized teacher to every child on the planet, and so much more. Wow. To get intelligence, we don’t need to teach sophisticated strategies to an AI. No-no-no! We’re just holding it back. Instead, get a smarter AI and make it learn by itself, and it will do better. It will find the crazy strategies that we cannot find ourselves. So, yes it is possible, and it is simpler than we thought: we need simple algorithms, tons of compute, and you will likely get artificial general intelligence, possibly superintelligence. So much so that the o3 AI now ranks among the best human programmers in the world. And this is what Two Minute Papers is about. Of course, it is about the papers, but also about the wider context around the paper. Something that you don’t really get elsewhere. And with that, the age of AI is here. Subscribe and hit the bell icon if you wish to see more. So, what would you Fellow Scholars use this for? Let me know in the comments below.