❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum
📝 The three papers are available here:
Grade school math: https://openai.com/blog/grade-school-math/
University level math: https://arxiv.org/abs/2112.15594
Olympiad: https://openai.com/blog/formal-math/
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background image credit: https://pixabay.com/photos/laptop-computer-green-screen-3781381/
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/
#openai
Table of contents (2 segments)
Segment 1 (00:00 - 05:00)
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to have a little taste of how smart an AI can be these days. And it turns out, these new AIs are not only smart enough to solve some grade-school math problems, but get this, a new development can perhaps even take a crack at university-level problems. Is this even possible, or is this science fiction? Well, the answer is yes, it is possible… kind of. So, why kind of? Let me try to explain.
This is OpenAI’s work from October 2021. The goal is to have their AI understand these questions, understand the mathematics, and reason about a possible solution for grade-school problems. Hmm, alright. So, this means that the GPT-3 AI might be a suitable substrate for the solution. What is that? GPT-3 is a technique that can understand text, try to finish your sentences, even build websites, and more. So, can it even deal with these test questions? Let’s see together. Hold on to your papers, because in goes a grade-school level question. A little math brain teaser if you will. And out comes, my goodness. Is that right? Here, out comes not only the correct solution to the question, but even the thought process that led to this solution. Imagine someone claiming, ten years ago, that they had developed an AI this capable. This person would have been locked into an asylum. And now, it is all there, right in front of our eyes. Absolutely amazing.
Okay, but, how amazing? Well, it can’t get everything right all the time. Not even close. If we do everything right, we can expect it to be correct about 35% of the time. Not perfect, not even close, but it is an amazing step forward. So what is the key here? Well, yes, you guessed it right. The usual suspects. A big neural network and lots of training data. The key numbers are 175 billion model parameters, and it needs to read a few thousand problems and their solutions as training samples.
That is a big rocket, and lots of rocket fuel if you will. But, this is nothing compared to what is to come. Now, believe it or not, here is a follow-up paper from just a few months later, January 2022, that claims to do something even better. This is not from OpenAI, but it piggybacks on OpenAI technology as you will see in a moment. And this work promises that it can solve university-level problems. And when I saw this while reading the paper, I thought… really? Now, grade school materials, okay, that is a great leap forward, but solving university-level math exams? Now we’re talking! That’s where the gloves come off. I am really curious to see what this can do!
Let’s have a look together. Some of these brain teasers smell very much like MIT to me. Surprisingly short and elegant questions, that often seem much easier than they are. However, all of these require a solid understanding of fundamentals, and sometimes even a hint of creativity. Let’s see. Yes! That is indeed right. These are MIT introductory course questions. I love it. So, can it answer them? Now, if you have been holding on to your papers, now, squeeze that paper, and let’s see the results together… my goodness. These are all correct. Flying colors! Perfect accuracy, at least on these questions. This is swift progress in just a few months. Absolutely amazing.
So, how is this black magic done? Yes, I know that’s what you’re waiting for, let’s pop the hood, and look inside together. Um-hm. Alright! Two key differences from OpenAI’s GPT-3-based solution. Difference number one. It gets additional guidance. For instance, it is told what topic we are talking about, what code library to reach out for, and what the definitions of mathematical concepts are, for instance, what a singular value decomposition is. I would argue that this is not too bad, students typically get taught these things before the exam too. In my opinion, the
Segment 2 (05:00 - 09:00)
key is that this additional guidance is done in an automated manner. The more automated, the better. Difference number two. The substrate here is not GPT-3, at least, not directly, but Codex. What is that? Codex is OpenAI’s GPT language model that was fine-tuned to be excellent at one thing. And that is, writing computer programs, or, finishing your code. And as we’ve seen in a previous episode, it really is excellent. For instance, not only can it be asked to explain a piece of code, even if it is written in assembly, or create a Pong game in 30 seconds, but we can also give it a plain text description of a space game, and it will write it. Codex is super powerful.
And now, it can be used to solve previously unseen university-level math problems. Now that is really something. And, it can even generate a bunch of new questions, and these are bona fide, real questions. They do not just exchange the numbers; the new questions often require completely different insights to solve. A little creativity I see! Well done, little AI! So, how good are these? Well, according to human evaluators, they are almost as good as the ones written by other humans. And thus, these can even be used to provide more and more training data for such an AI. More fuel for that rocket. And, good kind of fuel. Excellent.
And, it doesn’t end there. In the meantime, as of February 2022, scientists at OpenAI are already working on a follow-up paper that solves no less than high-school mathematical olympiad problems. These problems require a solid understanding of fundamentals, proper reasoning, and often even that is not enough. Many of these tasks put up a seemingly impenetrable wall, and climbing the wall typically requires a real creative spark. Yes, this means that these can get quite tough. And their new method is doing really well at these. Once again, not perfect, not even close, but it can solve about 30 to 40% of these tasks, and that is a remarkable hit rate.
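To make the "additional guidance" idea from the university-level paper a bit more concrete, here is a minimal sketch of how a raw question might be wrapped with a topic, a code library hint, and a definition before it is handed to a code-writing model. This is an illustration only: the `Guidance` class, the `build_prompt` function, and the comment-style prompt layout are hypothetical names and choices made for this example, not the format used in the paper, and no model is actually called.

```python
from dataclasses import dataclass, field


@dataclass
class Guidance:
    """Hypothetical container for the extra context added to a question."""
    topic: str
    library: str
    definitions: dict[str, str] = field(default_factory=dict)


def build_prompt(question: str, guidance: Guidance) -> str:
    """Wrap a raw question with topic, library, and definition hints.

    The layout (one comment line per hint) is an assumption made for
    this sketch, not the prompt format from the paper.
    """
    lines = [
        f"# Topic: {guidance.topic}",
        f"# Use the {guidance.library} library.",
    ]
    for term, meaning in guidance.definitions.items():
        lines.append(f"# Definition: {term} is {meaning}")
    lines.append(f"# Write a program that answers: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    guidance = Guidance(
        topic="linear algebra",
        library="numpy",
        definitions={
            "singular value decomposition":
                "a factorization of a matrix M into U, S, and V^T",
        },
    )
    print(build_prompt("Find the singular values of [[3, 0], [0, 4]].", guidance))
```

The resulting text would then be given to a code model such as Codex, and the program it writes would be run to obtain the answer. Because the hints are assembled by a function rather than typed by hand, the same wrapping can be applied to every question in a course automatically.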
Now we see that all of these works are amazing, and they have their own tradeoffs. They are good and bad at different things and have different requirements. And most of all, they all have their own limitations. Thus, none of these works should be thought of as an AI that just automatically does human-level math. However, what we now see is that there is swift progress in this area, and amazing new papers are popping up not every year, but pretty much every month. And, this is an excellent place to apply The First Law Of Papers, which says that research is a process. Do not look at where we are, look at where we will be two more papers down the line. So, what would you use this for? Please let me know in the comments below, I’d love to hear your ideas. And also, if you are excited by this kind of incredible progress in AI research, make sure to subscribe and hit the bell icon to not miss it when we cover these amazing new papers. Thanks for watching and for your generous support, and I'll see you next time!