# BIG AI News : OpenAI Surprises Everyone, Superintelligence approaches, Benchmarks Crushed...And More

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=1AMMmBsiTV4
- **Date:** 04.04.2025
- **Duration:** 36:35
- **Views:** 46,636
- **Source:** https://ekstraktznaniy.ru/video/13100

## Description

Join my AI Academy - https://www.skool.com/postagiprepardness 
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Checkout My website - https://theaigrid.com/


Links From Today's Video:
00:00 – Microsoft’s Agent Surprise
00:36 – Copilot Button Incoming
01:02 – Excel Gets Smarter
03:03 – Video Models Compete
04:01 – Runway’s Character Upgrade
05:09 – Filmmaking Made Easier
09:16 – Underrated Model Emerges
10:54 – Image Leaderboard Shifts
14:04 – Perfect Product Photos
16:03 – Material Transfer Magic
17:21 – Design Revolution Coming
19:10 – Software Job Shock
21:02 – Layoffs And Upskilling
25:08 – AI’s Emotional Edge
26:15 – Sam Altman Hints
30:00 – Math Benchmark Twist
33:11 – Superintelligence Warning Signs



Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe.

## Transcript

### Microsoft’s Agent Surprise [0:00]

So, one of the first pieces of news this week is Microsoft releasing an agent in their Copilot suite. It's actually really interesting to see the differences in products. Many of you know that OpenAI and Microsoft have a partnership agreement, and Microsoft is still releasing products on the Copilot side. I know most of you who use ChatGPT daily don't really use Copilot, but there is actually a very large enterprise use case for the individuals who use Copilot and the workspaces there. And Microsoft are still updating

### Copilot Button Incoming [0:36]

them. Satya Nadella did tweet this, and I wanted to show you guys this because there are actually going to be some major Copilot upgrades coming soon that I think a lot of people are going to use. And I also think you have to understand these Copilot updates, because Copilot is going to be natively on computers. There's always going to be a Copilot button there. So, it's going to be interesting to see how AI and technology are going to be meshed together immediately.

### Excel Gets Smarter [1:02]

researcher now is able to take these reasoning models and apply them not just to the web but to that entire richness I described of the enterprise data. We've done the same for data analysis. Now, one of my favorite tools of all time has been Excel, and now we go beyond even Excel to create a data analyst where you can give it any arbitrary data. In fact, the most beautiful thing is you can give it two Excel spreadsheets and just say, go analyze and come back with all the insights. Let's say I work in product development and we're entering a new market, so I need help developing a product strategy for our expansion. I enter my prompt and Researcher immediately gets to work. You'll notice it's reasoning over all of my work data, not just one file. Just look at this thorough response. It's something in line with what I'd expect from a researcher on my team. And now I can edit this response in Pages and bring in my team to collaborate. Now that we've got a great product development strategy with Researcher, let's turn over to Analyst. We built Analyst to think like a skilled data scientist, so you can go from raw data to insights in minutes. Now, I have this really messy, highly complex data set. You can see thousands of rows and multiple tabs with customers and their monthly revenue, and none of it has been cleaned or contextualized. Usually, to make sense of this data, I'd have to pull in a member of my team who knows Python, but we're going to give the Analyst agent a try. I'll simply ask Copilot to help me come up with a way to easily understand, learn, and visualize my data. And here I've got the answer I was looking for, with an electric visual to share with the team.

### Video Models Compete [3:03]

Now, it's also important to talk about the video leaderboards, because there have been some very interesting developments. Video generation is one that is quite hard, I would say, and it's probably something that is going to continually improve over the years. Interestingly enough, what we currently have is a situation where Kling 1.6 Pro is actually the best model by a decent margin, even surpassing the state of the art from the West. So, you can see we've got Google Veo 2 at 971 ELO in 355 appearances. The other ones, like PABS, are at 934 ELO. And we've got MiniMax and even Runway. But what's crazy about this is that we've even seen Sora all the way down here in terms of video generation. But like I said before, video generation in AI does not stop at all. I do remember when

### Runway’s Character Upgrade [4:01]

they released Sora sometime last year. But what's crazy is that one of these systems got a major update, and I think this is going to truly change everything. And I'm actually going to be using this myself. So, the company that got a major upgrade was Runway. What Runway did was introduce consistent characters. Now, I like this the most because consistent characters are one of the hallmarks of a good film, and they allow you to really immerse yourself in the storyline. It's pretty hard to watch something that doesn't have consistent characters. If a character is changing in subtle ways, it just doesn't give us the realism that we need, and as humans, we've evolved to pick up on these subtle differences. So, this small change that Runway have implemented here, I think, is going to usher in a new wave of creativity, because you're probably going to see short films and a lot more independent projects get created, whereas previously that would require a lot more work. So, this is something that is really cool. Now, not only did they

### Filmmaking Made Easier [5:09]

introduce consistent characters, which is a feature that I really, really like, they also introduced this feature right here, which is object consistency. I spoke about this in the previous video, where I covered how Runway are changing things. But object consistency is, once again, super important, because with AI-generated videos, objects often change and morph and do a whole bunch of different things that just aren't in line with what we're used to seeing. So now we can have object consistency that's going to make filmmaking a lot easier and a lot better. And a lot of people don't understand where Runway fits in the overall landscape of video generation models, but I think it's now becoming clear where they actually fit. They fit in the short film/big movie kind of video generator. I don't think they're meant to be used by the average person, but more so by those who are genuinely trying to create a masterpiece, because that is how you've seen the product advertised. For example, if we take a look at something they released, which shows how AI and VFX are essentially becoming one, we can see that this does look like some kind of short film. It doesn't really look like an animated video, whereas with other video generators, that's what we've seen. Everything we see here is really thought out and has that film-grade look to it. So, I wouldn't be surprised if, maybe, Runway have a model in the future that's so good that film studios potentially even start to use it. And I think that's where the large-scale application for that company is going to be. So, you can see right here that, when adding the VFX with AI, they've shown how incredible it is.
And this is going to be something that truly changes things, and it was only released like 2 or 3 days ago. So, we won't see the incredible changes right now. But it's quite likely that, like I said already, you're going to start to see films, short films, and independent projects produced at a much faster rate. And it's something that I really do like to see, because previously you'd have to have thousands of dollars in budget, and you'd have to spend a lot of hours in the studio refilming things. I mean, I've done a few short films before, and it really takes a lot longer than it looks. So, I can definitely appreciate this. And one of the things I spoke about in this video was AI and VFX. I will say that once they are able to get visual effects to the point where they are as good as current VFX, there is no way in hell they won't use it, because most of you may not be familiar with the fact that traditional VFX is ridiculously expensive and ridiculously time-consuming. Oftentimes the reason there are movie delays is that the visual effects take so long. They pay these teams millions of dollars, and not just that, it's millions of dollars in simply rendering and computing how everything looks, so that they can get it to look really realistic. Water simulations, fire simulations, smoke simulations, those things take hours to render, guys. If you think AI inference is slow, trust me, you'll be waiting days, weeks, and months for certain projects like CGI to render. There are entire farms, entire workstations, entire offices, just rows and rows of computers dedicated to rendering out stuff, because it literally takes that long. So, this is going to have some really, really cool implications. Now, they weren't the only video model that was actually released.
There was actually another one that, even though they're not currently on the video generation leaderboards, which is fine, is by far, I would say, the most underrated video generator right now. And I'm actually going to be using them quite a lot. So

### Underrated Model Emerges [9:16]

this model is currently called Higgsfield. And this one is so good because it natively gives you different camera angles that just work. So, this was a Coca-Cola ad that someone created with Higgsfield, and it looks really good because, like I said before, if you don't have any filmmaking experience, you're not going to be familiar with all of the different character zooms and camera zooms and different ways to place the character inside the scene. There's a million different ways that you can use the character creatively. But when you hop over to Higgsfield, you can really just change things and explore how the camera works around that scene. And I think it's one of the most important features, because usually when you put an image into a video generator, you're basically playing Russian roulette. And the reason that's so frustrating is that you don't have much control over what happens in the scene. You essentially drop an image in, you type in a prompt, and say, "Okay, the kid is going to run down this hallway." Sometimes it works, sometimes it doesn't. Sometimes you'll want the camera to pan around and it doesn't work. But this one performs the best, in my honest opinion, out of every single video model when it comes to single camera shots that are really perfect. So, if you want it to rotate or do a specific camera angle, it just performs really well. And I'm surprised it does that so well. Now, we're actually going to talk about something else. If we're on video generation, why don't we take a look at image generation? And image generation surprisingly got a major upgrade. Now

### Image Leaderboard Shifts [10:54]

this entire section of the video should show you just how quickly the AI space moves. Most people don't realize that one month in AI is like 6 months or a year in any other industry, and it's crazy to see how quickly things are changing. So, we had this company called Reve Image, and they released something that was really cool: an image generation model. You can see right here that on March the 31st, Reve Image (Halfmoon) was number one on the Arena ELO, with over 1,19 and over 10,000 appearances. They were doing this under an anonymous name. And of course, we can see that they even managed to surpass GPT-4o, Recraft V3, and Imagen 3. And my favorite model at the moment is currently Imagen. Now, remember guys, this was only around 3 days ago. And when we actually take a look at this model, it is really, really good at doing super difficult lifelike scenes. I personally used the model for a few different projects, and what I've realized is that this model is not really good at CGI, 3D animation, and those kinds of images; it's more focused on real-life subjects like humans, animals, and really striking poses that just feel super natural. Like this Vogue lookbook with different headshots of Timothée Chalamet. It's a really interesting way for the person who prompted this to capture just how good the model is. If I saw this, I wouldn't have thought for a second that it was AI generated. And there are no giveaways. Literally no giveaways. And one of the key things they really decided to focus on was making the text really, really perfect. So, in this image, you might not be able to see it, but in the top right there is actually text that says Timothée Chalamet wearing a Saint Laurent leather jacket, and the text is actually perfect.
So, the fact that they can get that tiny, tiny text to look so perfect indicates to me that they really spent time making the text in this model really, really impressive. So, this is a model that tons of people at the time were thinking, "Wow, this is incredible. This is crazy." And it's basically free, because you could use it on their website as a free preview. Now, of course, like I said, AI moves incredibly quickly, much faster than many other spaces. If you release a model today, there's no guarantee that your model will still be number one the next day. And that's exactly what happened, only 2 days after this model was released. We got an update to the image-gen leaderboard. OpenAI decided to once again take the number one spot with their incredible new image generation tool. And it wasn't the only one that managed to dethrone Reve Image. We also had Recraft V3, which was also very impressive, and GPT-4o. Well, let's just

### Perfect Product Photos [14:04]

dive straight into that. So, we had this for ad creatives. You can see right here we could generate an image. You can see it said, generate an image with this bag and put it on a model, put it in Paris. I don't think you understand how much time this is going to save. Let's say you're a small brand owner from the UK and you wanted to do a photo shoot in Paris. You'd have to book a flight or book travel. Then you'd have to get your bag. You'd probably have to get a person you could use as a model. Then you'd have to set the scene. You'd have to think about the posing. You'd also have to make sure it was the right time, because if it's not the right time, the lighting could be affected. If it's raining, it's not going to work. If it's snowing or extremely hot, it's not going to work. Things have to be right. There are so many different things that come into play. And you could do all of this now by simply taking that image and one prompt and sidestepping all of that in a quick session with your AI model. So, for me, this is remarkably impressive, because if you own any kind of business, you can generate images that look remarkably realistic with your product. This is a huge change to the e-commerce industry. I'm not sure how I would feel about this if I were a model. Of course, there are really minor mistakes that it doesn't get just right yet, but I don't think that really matters for the large-scale use case it's going to be applied to, especially on designs like this. On more intricate designs where you have really specific logos, it does make mistakes, and I have seen that. But in cases like this one, where the design is literally just a pattern-style texture, it's completely fine and works. Now, they also have character consistency and material consistency, which is pretty crazy. So this is material transfer. So

### Material Transfer Magic [16:03]

I think that this is one of the craziest things I've seen, because I did a video on this. And I think something like this is remarkably impressive, because what you can do with it is just completely endless. The use cases are out of this world when we start to think about where we can go with this kind of thing. Of course, there are small things you can do, like little images for you and your friends, but there's a bunch of use cases that completely change things, because graphic design is the way that we communicate with images, in terms of being able to sell products and communicate different things. And I think a lot of people haven't truly realized that yet. So, things like material transfer: if you wanted to do this traditionally, it would take a really long time. You'd have to somehow get a 3D model of the girl, then get this material right here, then transfer it onto the girl and actually make it look nice. So, there's a lot to be said about GPT-4o image. I did make a video in which I spoke about the 15 different ways and use cases that people are using the model for, and I think it's a really informative video that you probably should watch. Now, one of the last ways I'm going to talk about here is, of course, infographics. Another way that people are using this creatively is to create different infographics for

### Design Revolution Coming [17:21]

certain things. So, I think it's really important that you take the time to understand that this is a changeover moment. Like we had with the coding moment, I think you really need to understand that with design, a lot of these AI tools just give you more agency. They allow you to do more with less. And because of that, you can now go ahead and create and build a lot more things that would otherwise take a lot more people. And it means that over time, we're definitely going to get a lot more products. I mean, this infographic right here: the on-model shot, the ghost mannequin, the studio shot, the flatlay, the close-up, the lifestyle shot, the group shot, the detail shot, the hanging shot. That is really impressive, and it would take a really long time to do in Illustrator. It's really, really surprising how good this is. Of course, sometimes there are minor mistakes and things that don't work well, but definitely pay attention to this. Now, remember what I just said: this is quite a changeover moment, and I think it's important, because what it means is that you may not need designers for certain things anymore. And that, of course, could lead to layoffs. So, let's actually start to get into this, because some people are basically saying that. Also, if you guys didn't know, I have relaunched my AI Grid academy. You can join over 100 members using AI agents and prompt templates to make money. Currently, I'm doing a walkthrough challenge with the community where I take a new business from zero to 10K a month with AI. That's something that I'm currently working on and doing with members step by step. And also, I just dropped a private video on how you can actually use Claude to 10X your business and workflow.
So, if that's something that intrigues you, don't forget to check it out. There's going to be a lot of content coming soon. Graphic designers are now going to be laid off. But take a look at what this person was talking about when it comes

### Software Job Shock [19:10]

to AI. So, this person made a post on the Singularity subreddit, and it went viral. It said: "Well, my entire software engineering team was just laid off because of AI. Honestly, I feel physically ill. I just need to vent. Sorry if this is incoherent or if this isn't the right sub. I put my blood, sweat, and tears into this field. I worked my ass off to get this degree. I ended up joining FAANG. I made tons of money. I quickly burnt out. But overall, the company started adopting AI last year. It started super light, but they started investing more and more into it. And I'll be 100% honest with you, I was skeptical, so I never used it or studied it much, and I thought it was another fad. Well, I guess not, because apparently we have enough productivity gains from it to make our team's work redundant. And all of the work moved from our now non-existent team to another team." Now, I don't think that this is the whole reason this person got laid off. I think this is more of a human reaction to the changes going on in the economy. And let me be clear, guys, that's not me invalidating this person's feelings or their experience at the company. The company may have decided that, yes, we now have AI gains, so we no longer need as many coders. But what I'm saying is that, on a broader landscape, I do think that software engineers overall are going to be needed more. One of the things I've seen and constantly reference is the World Economic Forum document where they did tons of research into the most popular jobs in 2030 and the industries that are actually growing. And because you have all of these AI tools, you get individuals who are going out now and creating their own apps. And if those apps become successful, you actually do need a really experienced software developer to manage the software, because they understand the code at more of a base level.
So they understand the different levels of abstraction, rather than just your traditional vibe coder. So whilst

### Layoffs And Upskilling [21:02]

yes, this isolated incident might be due to AI, and there will be some instances where AI does replace people, I think for the most part CS majors are not completely cooked, because I do think there is going to be more adoption of AI coders and more code in the near future. Now, if we're talking about layoffs, the CEO of Perplexity actually talks about the dystopian part of labor replacement: "It is, unfortunately, in the short term, there's going to be a lot of labor displacement. Not as many people are needed to get work done anymore. So how people upskill themselves and adapt; those who are using AIs are definitely going to be well positioned. So all that stuff is going to take place, and how people react to it. It's already like, you know, you don't need to build 10,000-person companies to be a trillion-dollar company anymore. So definitely, where are the next generation of graduates getting jobs? Existing big techs are laying off people, or not hiring more. So all this stuff is definitely going to impact the market, and it's very interesting that simultaneously, while creating new value and making software creation easier, we're also displacing existing labor and value. So how people deal with all this is going to be interesting to watch, and I don't think anyone really knows how it'll all play out." And someone who is a futurist, Nick Bostrom, who has written books on the future of AI, actually talks about the fact that even though there might be some layoffs short term, you should probably hedge your bets with regard to AI, because it doesn't make sense to completely switch careers just based on what's happening now. One thing that we do know is that the future is uncertain, and that's the only thing we can be certain about. So the key is, of course, adaptability.
So, for example, post Black Plague was one of the best times to be a peasant in human history, because so much of the labor force was culled that they had a lot more bargaining or negotiating power. And so I think you hear these stories of them having holidays for half the year or something like that, because they had so much more bargaining and negotiating power. It seems like AI, again, on the way there, will do the reverse, right? It'll make labor extremely abundant. And so practically, if I'm someone in law school or someone in medical school, training 10, 20 years before I can start making capital, and I'm building up my labor, I'm building up a skill, that trade-off suddenly seems a lot less attractive. Yes, human capital is a depreciating asset in this world of advancing AI. And so investments with very long payback times, especially if you're doing them only because of the ultimate payoff of a higher salary 20 years into the future or something, should be discounted accordingly. I mean, you would have to have scenarios where AI development takes longer, or where it gets so regulated that AIs can't perform these particular jobs. Well, potentially all of it. But we don't know how long that will take, right? So, it makes sense to hedge your bets a bit. You don't want to find yourself at 18 or 20 going on the labor market with no skills, and it turns out the whole AI transition has been delayed. So yeah, I would want to make sure to hedge your bets, get the broad basis of some useful stuff, but then also have fun while it lasts. I think it would also be a shame to have wasted your childhood. Another thing that I found super interesting here was a study showing that users favor LLM-generated content until they know it's AI.
So, in this paper, they found that people prefer AI to humans in interactions, and this is particularly prevalent in medical treatment. On one hand, because the people, which in this case are the doctors, have too little time to engage with patients' questions, and on the other hand, because

### AI’s Emotional Edge [25:08]

AI conveys more emotional quality, more empathy. This in turn lowers the inhibition threshold for asking questions, so that in the end more knowledge and understanding can be generated. So, overall, we can see that in many cases AI actually does outperform humans, and humans actually do like that until they realize they are talking to a machine. And I do wonder if this stigma around AI will ever go away. I do think humans will always have a preference for talking to another human; I don't think that will ever go. But in some areas, I think over time it may shift. One of the reasons I do like AI is that these models know a lot, and you can just ask a million different questions, like on a consulting call, really quickly, and it's never going to get mad at you. It's never going to hang up the call. It's just going to answer your questions for as long as you want, and it can explain pretty much anything that you want to know. And I think that's a real benefit of AI, and something that really helps in the medical field, where individuals have a million different questions about their condition, their diagnostics, and how they're supposed to treat themselves, and AI can

### Sam Altman Hints [26:15]

do that with a level of empathy that some individuals just simply don't have. Now, Sam Altman had a crazy statement, because he also responded to a tweet where someone said, "Just wait till Sam Altman launches an update for music." And he posted some eyes here. So, you can see right here that this is a response to a tweet in which they are talking about a music model. Now, OpenAI previously did have a music model called Jukebox, and it was really good for its time. So, it's quite possible that they will release another sort of user interface that you can generate music from. I'm not sure when that's going to come; there isn't much on the internet about it, honestly. But OpenAI have a track record of upending famous companies. Right now, the companies that are doing well are Suno and Udio. And I wouldn't be surprised if they just launched a model that was able to generate fully-fledged songs in a way that was remarkable, and people didn't realize how good it was. I wouldn't be surprised if that took over the internet for a week or so at some point in the next couple of months. So, Sam Altman here is tweeting that, look, we might be working on something, and I think it's going to be super interesting when it releases, because once again the entire mind share, people are going to be like, wait a minute, we can now create music really quickly, and they're probably going to start doing that, which would of course be really interesting to watch. Now, this wasn't the only thing that Sam Altman tweeted. He actually talked about the open-source nature of AI. What he spoke about here was the fact that they are excited to release a new powerful open-weight language model with reasoning in the coming months, and they want to talk with devs about how to make it maximally useful. This was probably one of the biggest pieces of news that went under the radar, because it was just a tweet.
It wasn't a real publication, and it was super interesting, because OpenAI, for the most part, was founded to create open-source research, but it wasn't until DeepSeek that they felt the pressure and the need to open source an AI product. Now, we can see here that they said: "We are very excited to make this a very good model. We are planning our first open-weight language model since GPT-2. We've been thinking about this for a long time, but other priorities took precedence; now it feels important to do. Before the release, we will evaluate this model according to our preparedness framework, like we would for any other model, and we will do extra work given that we know this model will be modified post-release. We still have some decisions to make, so we are hosting developer events to gather feedback and later play with early prototypes. We'll start in San Francisco in a couple of weeks, followed by sessions in Europe and APAC. If you're interested in joining, please sign up at the link above. We're excited to see what developers build and how large companies and governments use it where they prefer to run a model themselves." Now, this is super interesting. Many other companies and businesses have focused on positioning themselves as undercutting OpenAI by saying, look, we have an open-source model, ours is best, theirs is closed source and expensive. But what is going to happen to those companies when OpenAI manages to make an open-source model that is better than theirs, faster than theirs, and cheaper to run? Are those other open-source companies going to go out of business? One of the reasons they currently exist is that OpenAI's stuff is all closed source. So maybe OpenAI is undercutting the undercutters. It could be really intriguing. These guys have been like, "Okay, we're going to undercut OpenAI by releasing something cheaper, faster, and free." But if OpenAI does that, are they going to potentially get rid of all the competition, and could it be

### Math Benchmark Twist [30:00]

something that works in their favor? Now, when it comes to benchmarking these models, there are some surprising results. A new benchmark was released, and it was essentially about math: the USAMO, the USA Mathematical Olympiad. Students go through two easier competitions first: the AMC 12, which is the entry round and multiple choice, then the AIME, which is a bit harder, with short numerical answers. From around 37,000 students, only about 265 make it to the USAMO, where the problems are extremely hard proof problems. So this is where we look at how AI performed. The AIs performed well on the AMC 12 and the AIME, but when it came to this new benchmark they usually flopped: they scored close to zero out of 42 and couldn't get more than one out of seven points on most problems, which is about 14%. These AI systems were failing because they are great at multiple-choice questions, but the test suddenly switched to something like writing a full essay, where you have to explain your thinking in perfect detail, and these LLM systems have never really practiced that. They were trained to give short answers, whereas the USAMO asks things like, "Prove that this mathematical pattern always works no matter what number you pick," which is really, really tough even for humans, not just AI. The author of this article, Greg, was testing o3-mini-high to see whether, with a little help, it could do proofs. What he found was that even though AI is currently quite bad at proofs, it is really, really close, and he thinks the next, slightly smarter versions of AI could actually start solving mathematical problems at this level.
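The scoring figures quoted above can be sanity-checked with a little arithmetic. This sketch assumes the standard USAMO format of six proof problems graded out of seven points each, which is what the "zero out of 42" and "one out of seven" numbers imply:

```python
# Sanity-checking the USAMO numbers quoted above.
# Assumed format: 6 proof problems, each graded out of 7 points.
problems = 6
points_per_problem = 7

max_score = problems * points_per_problem
print(max_score)  # 42 -- the "zero out of 42" maximum

# Scoring at most 1 point on a 7-point problem:
print(round(1 / points_per_problem * 100))  # 14 -- the "like 14%" figure

# Funnel into the USAMO: ~265 qualifiers out of ~37,000 entrants.
print(round(265 / 37_000 * 100, 2))  # 0.72 -- under 1% of students qualify
```

So the models weren't just missing the top score: on most problems they were stuck at the bottom of each problem's seven-point scale.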
He says AI has the pieces; it just struggles to put them together into a clean proof. And the crazy thing is that we then got a surprise: Gemini 2.5 Pro managed to stun absolutely everyone. You can see right here that it got 93% on one of the questions, and overall it scored 24%, which is a huge improvement over these other models. At first this shocked me, but on reflection it actually didn't, because I remember Google talking about hitting 90% on a math benchmark and everyone just completely forgetting about it; Google have been talking about their math innovations for a while. The other reason it doesn't surprise me is that I don't think the average person who uses ChatGPT is writing mathematical proofs or even trying to figure out how to use that in their day-to-day life. So even though Google is honestly excelling in that area, really ahead of anyone, it's something they unfortunately don't get credit for, because it's not something the average everyday user wants, and thus people don't realize just how good it is. It's striking, because I've never really used math at the level these models can do it on a day-to-day basis. And I think in

### Superintelligence Warning Signs [33:11]

the future, it's going to be really interesting to see how much further these Google models get in terms of their actual performance. Now, if we're looking at AI systems and asking how smart they really are, Yann LeCun basically says that these systems aren't even that smart. Here's his argument: "Compare this with the amount of information that gets to our brains through the visual system in the first four years of life, and it's about the same amount. In four years, a young child has been awake a total of about 16,000 hours. The amount of information getting to the brain through the optic nerve is about 2 megabytes per second. Do the calculation, and that's about 10^14 bytes. It's about the same. In four years, a young child has seen as much information or data as the biggest LLMs. What that tells you is that we're never going to get to human-level AI by just training on text. We're going to have to get systems to understand the real world, and understanding the real world is really hard." That's something Yann LeCun has said for quite some time, and honestly, I agree with him for the most part. I posted this clip on Twitter; I wouldn't say it got a lot of views, but there were people who pretty much disagreed. And the more I look into these AI systems, the more I do think there is a level of intelligence in them. But I also think that trying to get to AGI without the physical basis that humans have is going to be really, really hard, because it's like trying to replicate something without the core component of that thing. If you're trying to get to AGI, which you could say the average human has, how would you do that without the physical senses: the vision, the touch, the smell, and all the other senses that make up what a human is? With that being said, I think that was something rather important to include.
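LeCun's "do the calculation" line can be reproduced directly. This is a minimal sketch of his back-of-the-envelope estimate, using only the figures he gives in the quote (16,000 waking hours, ~2 MB/s through the optic nerve):

```python
# Reproducing Yann LeCun's back-of-the-envelope estimate quoted above.
waking_hours = 16_000           # a child's waking hours in the first ~4 years
seconds = waking_hours * 3600   # convert hours to seconds
optic_nerve_rate = 2e6          # ~2 megabytes per second, as bytes/second

total_bytes = seconds * optic_nerve_rate
print(f"{total_bytes:.2e}")     # 1.15e+14 -- on the order of 10^14 bytes
```

That 10^14-byte figure is what he compares against the training data of the biggest LLMs, which is the basis of his claim that text alone can't match a child's visual experience.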
And of course, another AI godfather, Yoshua Bengio, is talking about the fact that, per all these recent papers you may not have seen, it's becoming pretty clear that we may lose control of AI; these AI systems are really not telling us what they're thinking. Here's what he said: "For the most part, it's becoming pretty clear that we are on a trajectory to build AGI and superintelligence, and the timelines are getting shorter. Keeping that in mind is crucial, right? It's not the AI that exists now that needs to be secured; it's the AI that will exist in one year, two years, three years that will have much greater capabilities. Intelligence gives power, and power can be used in many good and bad ways. I want to be blunt that the elephant in the room is loss of human control. We are seeing signs in recent months of these systems having self-preservation behavior and power-seeking behavior: for example, trying to escape when they know they're going to be replaced by a new version, or faking being aligned with their human trainer so that their goals wouldn't be changed, basically preserving themselves in some ways. So when we think about security, we need to think about two aspects: humans trying to misuse these systems or exfiltrate model weights, whether to run them without safeguards, to fine-tune them toward something bad, or to seek some other economic or military advantage. But we also need to think about the AI systems themselves as a sort of insider threat."
