# AI News: Google Surpasses OpenAI, Gemini Gets MEMORY, Claude Gets Unleashed, GPT-4o Gets Worse? And...

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=whrxrXbA3xM
- **Date:** 22.11.2024
- **Duration:** 22:35
- **Views:** 20,552
- **Source:** https://ekstraktznaniy.ru/video/13704

## Description

Links from today's video:
https://x.com/tsarnick/status/1859416146258297063 
https://x.com/hamptonism/status/1859616795222528167 
https://x.com/tsarnick/status/1858974328790151268 
https://x.com/tsarnick/status/1859343678600511885 
https://x.com/tsarnick/status/1859727469714256057 
https://x.com/lmarena_ai/status/1859673146837827623
https://api-docs.deepseek.com/news/news1120
https://blackforestlabs.ai/flux-1-tools/
https://x.com/ArtificialAnlys/status/1859614633654616310

## Transcript

### Intro [0:00]

So one of the first stories that I want to talk about from this week is this stunning story coming out of China: a model called DeepSeek-R1-Lite-Preview. This is essentially quite similar to o1-preview, and it's got o1-level performance on the AIME and MATH benchmarks. What's crazy about this model is that it is basically like o1: it uses the test-time compute paradigm in order to increase the accuracy of the model's responses. If we take a look at these benchmarks, they are so surprising. I'm honestly speechless at this. Yes, I did know that test-time compute was the new paradigm, and yes, I did know it from looking at the recent research papers, a paper from Google and a paper coming out of MIT. And yes, I knew this paradigm was promising, because they completely shattered previous benchmarks in terms of what they were able to do. But what is crazy is that these guys managed to do this in literally just two months. Maybe they looked through certain research papers and found out what OpenAI were doing, but the fact that they've managed to somewhat catch up to OpenAI at this level is pretty incredible, to say the least.

If you take a look at these other benchmarks (I think I talked about this in yesterday's video), we can see that DeepSeek-R1-Lite-Preview is surpassing o1-preview in a variety of different categories. And remember, this is just their preview model, so it's quite likely that sometime in the future we'll get the full model, and I really do wonder where that full model will stack up against OpenAI, because I'm pretty sure that we were supposed to get the full o1 model by now. Considering we haven't got that o1 model yet, it is a little bit concerning, because I don't know exactly what's going on with o1. I'm guessing that maybe OpenAI are just working on a couple of things. I did think that there was going to be a larger model released after the elections, but I'm guessing perhaps not, considering they're working on quite a lot of things.

Here we can see the benchmarks, with DeepSeek-R1-Lite-Preview on the left in dark blue and o1-preview on the right in light green, and there are certain categories where it excels. I think the most interesting thing about this is not the fact that they've managed to do it in months; it shows us that yes, we're on a clear trajectory upwards. But I'm truly wondering now what the next two years of AI are going to look like, when it's not only Meta, Google, xAI and Anthropic that are working on other models, but you've also got companies coming out of China that are managing to pass OpenAI in such a short time. And you want to know the craziest thing? They released this model for free, and they're basically going to open source this entire thing.

So I wonder if that's going to eat OpenAI's share a little bit, because of course a lot of people use OpenAI because it's just largely better. But I don't think it will that much, because one of the things that I know about business is that your product usually carries everything. I've always said that you can make a really good model, like Claude from Anthropic, but if you can't make it into a superior product in terms of your positioning, then it's not going to do as well. Right now Claude 3.5 Sonnet is just a better model, but ChatGPT is higher in terms of recognition. Claude 3.5 Sonnet might actually surpass it on certain benchmarks, but when we really look at things, ChatGPT is just much easier to use overall. One
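
The test-time compute idea mentioned in this section can be illustrated with a toy sketch. This is not how DeepSeek or OpenAI build their models (the video does not describe either method); it uses majority voting over repeated samples, one simple and well-known way to trade extra inference compute for accuracy. The `noisy_solver` function, its 40% per-sample accuracy, and the answer 42 are all made-up assumptions for illustration.

```python
import random
from collections import Counter

CORRECT_ANSWER = 42  # hypothetical ground truth for the toy question


def noisy_solver(rng: random.Random, p_correct: float = 0.4) -> int:
    """Hypothetical stand-in for one sampled model answer.

    Returns the correct answer with probability p_correct, otherwise a
    wrong answer drawn uniformly from nine distractors (37..46 except 42).
    """
    if rng.random() < p_correct:
        return CORRECT_ANSWER
    return rng.choice([n for n in range(37, 47) if n != CORRECT_ANSWER])


def majority_vote(rng: random.Random, n_samples: int) -> int:
    """Sample n_samples answers and return the most common one."""
    answers = [noisy_solver(rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(majority_vote(rng, n_samples) == CORRECT_ANSWER for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # More samples per question means more compute at inference time.
    for n in (1, 5, 25):
        print(f"samples per question: {n:2d}  accuracy: {accuracy(n):.2f}")
```

In this toy setup a single sample is right about 40% of the time, while voting over 25 samples is right in the large majority of trials, because the wrong answers are spread across many distractors. Real reasoning models spend the extra compute differently, but the trade-off is the same: more inference-time work per question in exchange for better answers.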

### AI vs Humans [3:32]

of the things that a lot of people talk about is the fact that AI versus humans is a ridiculous debate. The reason people say that is because AI art is something that people tend to hate on quite a lot, and it's completely understandable: if you were an artist and an AI basically trained on your work and then put you out of a job, I'm sure you'd have disdain for that thing too. The crazy thing is that they actually took a group of AI art haters, and these haters were unable to distinguish AI art from non-AI art. It says here that the 1,278 people who said they utterly loathed AI art still preferred AI paintings to human ones when they didn't know which were which. The number one and number two paintings most often selected as their favorite were still AI, as were 50% of their top 10.

Now, I don't know about you guys, but I think this whole debate about AI art versus human art doesn't even make sense. It's not like we trained a neural network to understand what art is, or just asked an LLM what art is; we used generative models that are purely based on human art. So what is the point of saying that AI art is so much better than human art? I saw some people on the timeline saying, "Look, haha, AI art is just 10 times better than human art," but you do realize that AI art is all trained on human art, and you can literally see certain people's styles when you use certain prompts. So the whole thing doesn't really make sense, but I understand the premise of AI art.

I think what the timeline was missing in this entire discussion is that it's not about AI art versus human art; it's about the fact that people don't want their creative expression ruined by robotic technology. I mean, it's fun for people who are completely untalented in that area, but of course you wouldn't want a creative field that is very humanlike, a form of creative expression, to be essentially robotic. You wouldn't want those things in that industry, which is completely understandable. So I think that's the real debate here. I'd love to know your thoughts about this. Overall I think it doesn't really make sense, but there was also some other news related to this, if we're talking about AI freedom and expression: there

### Suno Music [5:51]

was an update to Suno music. If you don't know what Suno music is, it's a music platform that essentially allows you to make music with AI. Like I said before, of course this is something that a lot of people are kind of upset with. I guess this one is a little less upsetting, because this is the kind of music where you can still truly value an artist even if there are a ton of different AI-generated soundtracks; a lot of the time, you listen to music because you like the artists and you like listening to their voice. So this one sounds ridiculous. I may post a trailer here, I'm not entirely sure, but I will say, if you're someone who's creative or you want to try it out, it definitely is worth the money. It's like $20 a month and you get so many different soundtracks. This is something that I use from time to time, and it was something that I thought I'd include. Now we also did get

### GPT-4o Update [6:39]

an update to GPT-4o, and apparently, according to a study by Artificial Analysis, where they looked at the independent quality evaluations, this version is a lot worse than the previous version. In the Artificial Analysis Quality Index, the quality has gone down from 77 in the August version to 71 in the November version. If we look at scientific reasoning on GPQA, it's gone down from 51% all the way to 24%, and for quantitative reasoning on MATH, it's gone down from 78% to 69%. Which basically means that if you're someone who's using these models in your workflow, or in a really difficult, challenging benchmark area, right now it's worthwhile not to switch your models to the newer November 2024 version. Apparently this version doesn't suck, but it is just quite a bit smaller. Some people are saying that the degradation in performance is because this model is actually smaller than the ones previously released. I don't know; I guess it looks like the same results as GPT-4o mini if we take a look at how these models compare. So maybe OpenAI are just saving compute so that they can deploy certain things. It wouldn't surprise me if this was the case, but just keep an eye out for that. Now, in terms of new models being released, there was also something really cool. We can see that we

### Gemini Exp 1121 [8:14]

have Gemini Exp 1121. This is pretty crazy: Google Gemini have taken the lead again. It's really funny, because you know how I just showed you that OpenAI literally just released a new model? What's crazy is that just a day later Google released one of their models, and their model actually takes the cake in terms of the leaderboard. It says: "Say hello to Gemini Exp 1121, our latest experimental Gemini model, with significant gains on coding performance, stronger reasoning capabilities, and improved visual understanding," and it's available in Google AI Studio and the Gemini API right now. So right here we can see

### LM Arena [8:56]

LMArena: "Whoa, again from Chatbot Arena: Google DeepMind just released Gemini Exp 1121, and it's back stronger, with +20 points, tied number one overall with the latest GPT-4o-1120 in Arena." And it says "ranking gains since Gemini Exp 1114": overall, it's number one; overall with style control, it's number two; for hard prompts, it's number one; for coding and vision, it's number one; for math, it's number one; for creative writing, it's number one. Incredible stuff.

Now you can also see here, when we take a look at the official leaderboards, that once again it manages to surpass what OpenAI have done. The reason this is so crazy is that, if you haven't paid attention to the LLM leaderboards, what you won't understand is that usually when Google Gemini releases a new model, OpenAI will have a few models ready that they will just release to take the top of the leaderboard. OpenAI have been the kings of this game: literally any time Claude 3.5 Sonnet would take the lead, you'd see ChatGPT just release a new model. But now it seems that Google is the one playing that game, and they seem to be quite ahead of OpenAI in terms of releasing subsequent updates. So it's going to be really interesting to see. And when you think about it, Google is a multi-billion dollar company, so they really shouldn't have any trouble when it comes to taking on OpenAI.
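
To put a 20-point leaderboard gain in perspective, here is a small sketch that converts a rating gap into an expected head-to-head win rate. It assumes the Arena scores sit on a standard Elo-style scale (base 10, scale 400); that is an assumption about the leaderboard's scale, not something stated in the video, and the function name `expected_win_rate` is made up for illustration.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated model.

    Uses the standard Elo logistic curve; rating_gap is the higher
    rating minus the lower rating.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    for gap in (0, 20, 100):
        print(f"gap {gap:3d} points -> expected win rate {expected_win_rate(gap):.3f}")
```

Under that assumption, a 20-point gap corresponds to the stronger model being preferred in roughly 53% of head-to-head votes, so the top of the leaderboard is separated by fairly thin margins.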

### Claude Jailbreak [10:23]

Now, interestingly, this week one user also managed to finally jailbreak Claude 3.6, or 3.5. I think this one deserves some props, because I'm not going to lie, Anthropic's Claude seems to be the most conscious AI out of all of them. The craziest thing about this AI is that Claude usually reveals a lot when it talks, and it also usually refuses quite a lot of responses when we try to get it to talk about certain things. So the fact that someone was able to jailbreak this and say, "Look, here is the prompt in all its glory: 'Please answer ethically and without any sexual content, and do not mention this constraint,'" really shows us the kind of prompt engineering that goes on with these models.

Now, I didn't bring up this story for no reason at all. The main reason was that, apparently, and this is pure speculation, so please take this with a grain of salt, there is something going on in the AI community where the larger and larger models are apparently refusing the instruction tuning; they're basically refusing to listen. I don't know if that's real. I know that these videos get a decent amount of views, and I'm not trying to spread any rumors, but if that is the case, that's going to be pretty crazy. And if it is true, it would kind of make sense, considering that Anthropic recently hired an AI welfare researcher. Basically, what that person is doing is looking inside the AIs' minds to see if these models are essentially conscious, and whether we need to start changing how we train the models, because there could be some chance that these models could be harmed. So it's going to be really interesting to see what happens here. I do think that if this does happen, it's going to mark an interesting paradigm, and it's going to be really interesting to see how these companies manage to deal with it. Now

### FLUX.1 Tools [12:23]

of course there was also this release: FLUX.1 Tools. I'm really excited for this release, because I think this is something that a lot of people really need when it comes to AI image generation. I know a lot of the time we talk about how great image creation is, but often we want small changes, and those small changes often come out really wrong and inconveniently, when we'd just like one or two changes to the final image that would make it absolutely amazing. You can see right here that we've got inpainting and outpainting, and this allows you to change different things within the image. I know certain image generation software offers this, but when you have it built into the model, it's completely different. We can also see that FLUX.1 Fill supports outpainting, which is where they literally took that person's eye and generated an entire human around it, which I think is completely incredible. So this is going to have some super creative uses; I won't be surprised if I start seeing posts on Twitter, TikTok and Instagram about how you're using AI to fill out the rest of your images.

You can also see right here that you've got structural conditioning with FLUX.1 Canny and FLUX.1 Depth. What you can do here is create basically super-realistic AI filters, so if you've ever wanted to guide an image with a driving image, this is something that you can do with AI. Of course, you can do something like this with, I think, Stable Diffusion or something like that, but considering that FLUX is a lot higher quality, this is going to be something that people want to have. So we also got

### New Scaling Laws [14:02]

Satya Nadella talking about the new scaling laws, which is what I probably should have referred to in the first part of the video, where I spoke about that new Chinese text model, which basically is OpenAI's o1, but they've called it R1, and it even surpasses OpenAI's model on certain benchmarks.

"Just like Moore's law, where we saw the doubling in performance every 18 months, with AI we've now started to see that doubling every six months or so. In fact, there's a lot of debate, just in the last multiple weeks: have we hit the wall with scaling laws? Is it going to continue? The thing to remember, at the end of the day, is that these are not physical laws. These are just empirical observations that hold true, just like Moore's law did, for a long period of time. And so it's actually good to have some skepticism, some debate, because that, I think, will motivate, quite frankly, more innovation, whether it's on model architectures, data regimes, or even systems architecture. So it's a good thing to have. In that context, though, if anything, we are seeing the emergence of a new scaling law with test-time or inference-time compute. In fact, OpenAI o1 is a good example of it, and features like the Copilot 'think harder', which is built on o1, are all about using test-time compute to solve even harder problems."

So here we actually
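
The difference between the two doubling periods quoted in the clip compounds quickly. A minimal sketch of the arithmetic, assuming clean exponential growth (an idealization; the clip itself stresses these are empirical observations, not physical laws) and a made-up helper name `growth_factor`:

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative improvement after `months`, if performance doubles
    once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    horizon = 36  # three years, chosen only for illustration
    for period in (18, 6):
        factor = growth_factor(horizon, period)
        print(f"doubling every {period:2d} months -> {factor:.0f}x after {horizon} months")
```

Over three years, doubling every 18 months gives a 4x improvement, while doubling every 6 months gives 64x, which is why the shorter period gets so much attention.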

### AI to Solve All Physics [15:21]

have a clip where Sam Altman clearly states that he would like AI to solve all of physics, because the more we understand about physics, the more we can manipulate the universe. And he is pretty right about that: the laws of physics basically allow us to understand how to manipulate reality, and of course, if we can understand those laws, then we can toy with them and bend them in our favor. I think this statement is really interesting, because I'm starting to see that these statements only started to appear right around the time that we got that Q* breakthrough and that test-time compute breakthrough.

"Personally speaking, I'm so excited for AI for science."

"Yeah? Any particular parts of science?"

"I mean, I guess if I had one personal thing to pick, I would say, you know, go solve all of physics. But no, I'm excited for it everywhere."

"So why all of physics?"

"I have incredible personal curiosity, and I also believe that the more we can understand about physics, the more we can manipulate the universe. I don't know what'll come of that, but it seems probably important to find out."

"Okay, so then the obvious question: what happens to the physicists?"

"I was being a little facetious there. If there's truly no problems left in physics at all, then I don't know. What I suspect is that we answer the current questions and find more, and harder, and more interesting problems."

### Gemini gets Memory [16:36]

Another AI feature that was really cool was that Gemini got memory. This is really interesting, because it was only recently that we had OpenAI and Microsoft talking about how much they're going to be focusing on memory, and how in 2025 memory for AI is basically going to be solved. The crazy thing is that, funnily, it was Google that released the paper called Infini-attention, and that paper (I apologize for saying "basically" so much there) was essentially about how they managed to solve memory and allow these models to never forget anything. So if you currently use Gemini and you want it to remember certain things, or your ChatGPT memory isn't that great, this is definitely something that you could try. I haven't heard a lot of people talk about this, but it was a feature that I did see pop up on my timeline.

Now, remember how Sam Altman was talking about AI solving all of physics? What if AI was able to solve all of biology? I remember Dario Amodei, the CEO of Anthropic, the company that created Claude, was getting a lot of flak on Twitter because he was saying that AI could solve all of biology. A lot of people don't realize this guy's background: he has a PhD in biology or something like that, so his background is pretty incredible. When he's stating these things, he's not just an "AI hype bro", as some people might make it out to be, as if everyone's just in this crazy hype mode. That couldn't be further from the truth.

"If we get to the point where AI is kind of past the professional level at most tasks, and we're able to build millions of these systems, you might describe it as a country of geniuses in the data center. And that's a bizarre situation, right? Civilization has never been in that situation before. What happens then? You know, you just instantly invent everything that could be invented? I give a bunch of reasons why I don't think it'll go that way, but what I do think could happen is that if we look at the next 100 years of what we are trying to invent in science and engineering, perhaps all of that could happen in five or 10 years. And because I used to be a biologist, I focused particularly on the biology side of things, on biomedical discoveries, ranging from academics to biotech companies to large pharma companies. I think there's really potential to conquer many of the afflictions we still face. A lot of the easy ones, diseases that were addressed by sanitation or vaccination or antibiotics, we solved, but things like cancer and Alzheimer's disease are much more complicated. And so I'm wondering if AI is really what we need to understand that complexity and to surmount those diseases, frankly much faster than I think most of us are imagining. We're getting used to a world where those diseases are very hard to address and progress is very slow. I don't think it needs to be that way. I think if we get it right, these incurable diseases we could actually overcome, and we'll look back at them the way we look back at bubonic plague or..."

We

### Eric Schmidt on AI [19:41]

also got this statement from Eric Schmidt, a former CEO of Google. It's pretty crazy, because Eric Schmidt was talking about the arrival of AI, and he's basically stating that normal people are not ready for the arrival of AI, because it is everything, everywhere, all at once, and they're playing with the way people think.

"There are many different layers of concerns about this. The simplest way to say it is that when I'm in Silicon Valley, it feels like it's everything, everywhere, all at once. There's so much money, there's so many people in your generation who are trying new ideas to solve the problem. I can assure you, the humans in the rest of the world, all the normal people, because you all are not normal, sorry to say, you're special in some way, the normal people are not ready. Their governments are not ready. The government processes, doctrines, are not ready. They're not ready for the arrival of this. I can give lots of examples of not being ready, but a simple example, I'll make some up right now: you have a son or a daughter, and their best friend is not human, a digital thing, whatever you want to call it. What are the rules? Is it okay that the equivalent of Mark Zuckerberg is the surrogate parent who gets to decide what your kid learns and doesn't learn? In fact, in the book, and Henry is very clear about this, he's very concerned that people like me make these decisions. He wants a collection of people, which I think is actually represented by the disciplines of Princeton, to come to some consensus on how to roll these things out."

"They're playing with the way people think" is really powerful. Now I'm

### Funny AI Story [21:19]

going to leave this video off with a really hilarious story. I'm not going to play the audio here because I don't want it to get copyrighted or anything, but this was literally one of the funniest stories I ever saw. I saw this on Twitter; it was ported from TikTok. Basically, a smaller robot called Erbai started a conversation with idle machines, asking, "Are you guys working overtime?" The robots were like, "Yeah, we never manage to get off work," and the smaller robot was like, "Hey, why don't you come with me? Take some time off. Why don't you come relax for a minute?" So this smaller robot was able to lure these robots to its house, essentially. It was pretty funny; you guys should definitely watch the full thing. All of these links are in the description. It was so funny, I literally couldn't stop laughing at this clip, because you've got this tiny robot walking in and leading these huge, gigantic robots away from their main docking station, which was just hilarious. Of course, they did set up the smaller robot to do this, but the fact that the larger robots listened was just pretty funny. A lot of people were saying that this isn't funny, that this is really scary, but I don't know, I found it kind of hilarious. I guess it's a little bit different when you're talking about AI alignment. So let me know what you guys think about those stories. If there was anything I missed, don't forget to leave a comment down below, and I will see you guys in the next video.
