❤️ Check out Vast.ai and run DeepSeek or any AI project: https://vast.ai/papers
📝 OpenAI's o3 is available here:
https://openai.com/index/introducing-o3-and-o4-mini/
📝 The paper "Humanity's Last Exam" is available here:
https://agi.safe.ai/
ChatGPT trick:
Go to your profile - personalization - customize ChatGPT, and add:
“Look for peer-reviewed sources for every answer. Rate the credibility of that source on a scale of 0 to 10 for every source. And of course - use more emoji.”
📝 My paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD
Sources:
https://x.com/DeryaTR_/status/1914133246465487026/photo/1
https://x.com/nicdunz/status/1913043509348643323
https://x.com/mckbrando/status/1913268371266932865
https://x.com/emollick/status/1913471315807191310
https://www.reddit.com/r/singularity/comments/1k1819k/o3_can_solve_wheres_waldo_puzzles/
https://www.reddit.com/r/singularity/comments/1k1819k/comment/mnk0sbn/
https://www.mdpi.com/2076-3417/12/10/5255
Or here is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu
#openai
Table of contents (2 segments)
Segment 1 (00:00 - 05:00)
It’s not every week that we see an incredible AI breakthrough like this, but this is that week. So, what just happened? Scientists at OpenAI showcased their new thinking model, o3, and so much more. It can now think with images, then help you learn more, and maybe even help you land a job. And, just a year ago, these AI systems had an IQ that was below the human average, and now? Are you kidding me? OpenAI’s o3 has a genius-level IQ. Now I’ll note that since we are Fellow Scholars here, we are a bit skeptical of such claims, especially since this does not come from a peer-reviewed paper. So let’s have a look ourselves. How does all this work? First, it is thinking with images. Little AI, here is an image, now tell me what is the name of the biggest ship here and where will it go next? And now, you can actually see the thinking process, this time with images, and there we go. An absolute miracle. Now let’s make it tougher. Your task is to read this sign. Wait a second…my first reaction to this was…what sign? Look, there is a tiny sign on the building — there is no way you can read that, right? Well, check this out. It looks at it, knows where to zoom in, and I am pretty sure it is cleaning up this image automatically, trying to find out what’s written. And, there you go. Unbelievable. Now, here’s a picture, what movies were filmed here? That is not something that you can do with just an easy search, you would need someone in the know for this. Or, just ask o3, and two and a half minutes later, boom, there you go. And, yes, I hear you Fellow Scholars asking, but can it find Waldo? And the answer is…yes. Yes it can. You may even be able to just take a photo of a menu, and it might find the restaurant that has that menu. Or, just take a photo of Slash, or some guitar hero, and ask what chord they are playing. Hint: it’s an E. It is always an E. And don’t forget, it can not only talk about images, it can even mark them up.
Here, it did so where it found some issues with small samples of fabric. Incredible. And believe it or not, it gets better. Its memory is getting better and better. For instance, you can do the following: little AI, read everything that we talked about before, maybe years and years of data, and now — you can ask it to tell you what it knows about you and teach you something that you don’t know yet. In this example, the user likes scuba diving and music. o3 knows that. And here comes something mind-blowing. It starts teaching him about coral larvae, underwater baby corals if you will. And they can detect reef sounds: they prefer the natural soundscape of healthy reefs, and if they hear it, they are more likely to attach and grow. So, crazy experiment: put loudspeakers underwater, pipe in the sound of a healthy reef, and there you go. They love it. So much so that they swim toward the sound and settle there. Wow. Imagine someone knowing you so well and teaching you this. If this were a person, we would say that this person is very kind and thoughtful. And we might say the same thing for a machine. I think that is incredible. And now that you know about this, you can ask again, it knows that you know this already, and then it can teach you something new. Maybe you are looking for a job as a research scientist, and once again, it knows what you know, but more importantly, it knows what you don’t know, and it will be able to help you prepare for an interview. But it can do more. A lot more. They showed how it did a big bunch of tests, okay, but what do these really mean, and which of these really matter? Well, I reckon you should look at this one. Humanity’s Last Exam. The toughest test for the toughest AIs out there. Questions from smart scientists from all around the world that no AI could answer at the time it was made. We just talked about it two videos ago, and I am super happy to see more models tested on it.
I don’t assume that it has anything to do with what I said, but I am loving seeing this. So, what is the result? Just a couple of months ago, 8% was the best any AI could score on it. Now, hold on to your papers, Fellow Scholars, because we are approaching 25%. 25%! That is stunning. About 3x in just a couple of months. I am out of words. By the end of the year, we might get over 50-60%, and that will be, for me, another breakthrough. A historic moment. Why?
Segment 2 (05:00 - 09:00)
Well, because we will probably have an answer to a super important question, and that is: is it a bit like a real scientist? Can it do research? Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Well, here is an indication of what it already can do. You Fellow Scholars are going to love this one. It has been fed a piece of research work that is incomplete, and now it is asked to finish it. That is a challenging task. And…look. This is really cool. It is showing us what it is looking at, zooming in, it shows you the thinking process. Once again, it thinks with images. Not just with text. Fantastic. So what does that give us here? Well, assuming that the experiment was carefully done, so that ChatGPT cannot simply look up the calculations somewhere, that seems to me like doing research. That is a game changer. You know, previously, you heard that these AI systems are nice, but they can only work within the convex hull of our knowledge. That means they can do what we can do. But from this point on, it seems that it can push us forward as well. And I said you Fellow Scholars are going to love it. Why? Well, because here, it is re-inventing something that has already been invented. And very soon, I am hoping to see it invent things that we couldn’t invent yet. Pushing humanity forward. You know, advancing drug design, better crops, clean energy, longevity, these would really push humanity forward. And I think it is coming very soon. Incredible. They also announced Codex, a coding agent. But not only for coders. This is pretty much for everyone. Here you don’t even need to write a huge prompt, you just refer to something a different Fellow Scholar did on social media, that is, creating beautiful ASCII character art from an image. And it writes an app that does this with the image from your camera, in real time. That is stunning. What a time to be alive! I would like to close with a little trick that I just started using. And I love it.
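By the way, the core idea behind that ASCII-art app is simple: map each pixel's brightness to a character, with denser glyphs standing in for darker pixels. Here is a minimal, self-contained Python sketch of that mapping — this is an illustration of the technique, not the code Codex generated in the demo; a real camera app would feed it grayscale frames, for example via OpenCV.

```python
# Minimal sketch of brightness-to-ASCII mapping, the core idea behind
# image-to-ASCII-art apps. A synthetic grayscale "image" (a list of rows
# of 0-255 brightness values) keeps the example self-contained.

# Characters ordered from dark to light; denser glyphs read as darker pixels.
RAMP = "@%#*+=-:. "

def to_ascii(gray_rows):
    """Map each 0-255 brightness value in each row to a character from RAMP."""
    lines = []
    for row in gray_rows:
        # Scale brightness (0-255) down to an index into RAMP (0-9).
        lines.append("".join(RAMP[min(p, 255) * (len(RAMP) - 1) // 255] for p in row))
    return "\n".join(lines)

# A 4x8 horizontal gradient from black (0) to white (255).
image = [[int(x * 255 / 7) for x in range(8)] for _ in range(4)]
print(to_ascii(image))  # prints four identical rows fading from '@' to ' '
```

In a real-time version, the same `to_ascii` function would simply run on every downscaled camera frame.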
In the app, go to your profile, then personalization, then customize ChatGPT, and there, ask it for the following: “Look for peer-reviewed sources for every answer. Rate the credibility of that source on a scale of 0 to 10 for every source.” And of course, don’t forget to ask for more emoji! Ha! You can copy this little message in there; I put it in the description. So when you ask about the IQ of OpenAI’s o3, it shows you not just the results, but also the difference between speculation and standardized testing. When it says 8 out of 10, I would give it more like a 5, or a 6 at most, but that’s okay — I can just refine my system prompt to make it a bit more strict. You can do it too. That is The Way Of The Scholar. Note that this works on any chatbot. And now, whenever you get an answer, you get a source you can look up. Or sometimes you might not even need to look up all the sources, because ChatGPT knows they exist, but they are not necessarily that credible. By the way, this video took a bit longer. We are not the ones who just put out a poor-quality video super quickly to get all of those views. No. We’re cooking it slowly. You know that slow cooking takes time, but it is the best way to cook. Especially for papers. More context, more examples, more papers. Yummy. So, in the end, you get better videos, hopefully. I am trying my best here. Like and subscribe if you appreciate it. So, how did your Scholarly experiments go? Let me know in the comments below.