# Stunning New OpenAI Details Reveal MORE! (Project Strawberry/Q* Star)

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=MG4z1LSEvqw
- **Date:** August 12, 2024
- **Duration:** 23:05
- **Views:** 39,681

## Description

Prepare for AGI with me - https://www.skool.com/postagiprepardness 
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Checkout My website - https://theaigrid.com/

00:00 - Introduction
00:49 - GPT-4o update rumors
02:19 - Tokenization
04:26 - Potential GPT-4o large release date
05:30 - Jimmy Apples
07:41 - GPT-4o model sizes
09:58 - Capabilities reminder
11:49 - Comparison with state-of-the-art models
13:18 - "Strawberry" and OpenAI's tiny model claims
15:46 - Ilya Sutskever
17:20 - AGI levels
18:38 - Safety concerns
19:30 - OpenAI's preparedness framework
21:28 - Summary and future predictions
23:00 - Conclusion

Links From Today's Video:
https://x.com/shaunralston/status/1822728918727893003
https://x.com/gdb/status/1790869434174746805

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Contents

### [0:00](https://www.youtube.com/watch?v=MG4z1LSEvqw) Introduction

So, there have been a few more details revealed regarding the Strawberry update, which is rather fascinating considering the various rumors circulating on Twitter, Reddit, and other social media platforms regarding AI development. This video is going to try to clear up a lot of the information that has been going around the space, because I know there has been a ton of speculation and things have been rather confusing as of late. A lot of this is going to be coming from the account iruletheworldmo, and a lot of it is also going to be grounded in real, factual evidence that we can verify. I'm going to try to keep this video as non-speculative and as factual as possible, without any hype or complete disregard for what is actually true/possible.

### [0:49](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=49s) GPT-4o update rumors

So, one of the things this account did recently tweet was that there has been a GPT-4o update, and I'm going to talk about why this ties into a further point, because there have been some recent revelations regarding GPT-4o. Diego AI tweeted that GPT-4o got updated out of nowhere, and iruletheworldmo has been posting tweets essentially stating that there is going to be some kind of update to GPT-4o in the future. Now, I don't think this is the update they were referring to; I think this was simply an update that OpenAI may have pushed out, because recently, if you check the benchmarks, Google has taken the lead in terms of having a model that is rather good at answering user requests and formatting the answers nicely, and in terms of how the models tend to reason, Claude 3.5 Sonnet has been in the lead for quite some time. So OpenAI pushing out an update like this isn't out of the ordinary. However, some people have tested GPT-4o and claimed it's marginally better, and you can test this yourself. To quickly verify it, I did my own short test on GPT-4o: I asked how many L's there are in the word "laloa", and it got the question wrong. Now, this doesn't mean GPT-4o is stupid by any means; it basically means one of two things.

### [2:19](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=139s) Tokenization

Either (a) the GPT-4o model they have isn't updated, as some have suggested, or (b) they just didn't roll it out to me, because sometimes OpenAI rolls out different instances across different countries and time zones. Maybe it's live in America, maybe in Brazil; I honestly don't know where it's being rolled out right now, but I've tested it personally. As for the issue people keep raising, whether a model can actually count letters, and whether that is why they called it Project Strawberry, I don't think it's as big of an issue as people make it out to be, because it comes down to tokenization anyway. As Shaun Ralston put it, yes, there are some improvements, but if you understand how LLMs process text, they break it down into tokens. They don't see individual letters as we do; they see tokens, they use those tokens to see how things relate to each other, and they use that, in a sense, to build a world model. So when you ask a model to count the number of letters in a word, it struggles to do that. As he says, the way a word is tokenized can change depending on the context within a sentence; for instance, the word "strawberry" might be split into multiple tokens, such as "straw" + "berry" or even "st" + "rawberry", depending on the model's tokenization rules and the surrounding text. And this is completely true: words are essentially broken down into tokens, which is why some people say this thing is not AGI, and so on. But it doesn't really matter, because as I've always said, if you can solve it with prompt engineering, what difference does it make? If you can change a few words and get the right output, the problem is solved. So I said: write out each letter of the word, verify whether each one is an L or not, and then count them, and it managed to get the answer right. So this whole "can it count the letters in strawberry" test, I don't think it's a really important reasoning benchmark, because prompt engineering solves it, and prompt engineering is going to be used in the most advanced reasoning cases where we want real answers anyway.
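
To make the tokenization point concrete, here is a minimal sketch using OpenAI's open-source `tiktoken` library. This is my illustration, not something shown in the video, and the exact token splits depend on the model's tokenizer:

```python
# Minimal sketch of why letter-counting trips up LLMs, using tiktoken.
# "o200k_base" is the encoding tiktoken ships for GPT-4o; the splits
# printed here are tokenizer-dependent and purely illustrative.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

word = "strawberry"
tokens = enc.encode(word)
print([enc.decode([t]) for t in tokens])   # the model sees chunks, not letters

# The prompt-engineering workaround from the video: spell the word out so
# each letter stands alone, and counting becomes trivial.
spelled = " ".join(word)                    # "s t r a w b e r r y"
print([enc.decode([t]) for t in enc.encode(spelled)])
print(sum(1 for c in word if c == "r"))     # ground truth: 3
```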

### [4:26](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=266s) Potential GPT-4o large release date

Even if you're just trying to use LLMs for certain applications, prompt engineering is something you'll use anyway, so I think this entire issue of whether it can count letters is a moot point that doesn't need much coverage. But back to GPT-4o. This Strawberry account, the one that has been tweeting a lot of this stuff, tweeted a GitHub link. I was actually on Twitter at the time, and I thought they were going to release something like an open-source model, considering all of the recent pieces of information, but what they actually did was post "GPT-4o large" with the date 2024-08-13. That date is a Tuesday, and the reason I think this matters is that, as you know if you've been paying attention to the AI space, the majority of OpenAI updates and big-tech AI releases land on a Tuesday or a Thursday. So, considering that this date could be the release date for a model, this is rather fascinating.

### [5:30](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=330s) Jimmy Apples

Now, the reason I think this is even more fascinating is a recent tweet from someone with a long history of being a very credible AI leaker, and that is Jimmy Apples. I don't really want to say leaker, because he doesn't leak whole things so much as hint at the next big model releases, but his hints have been rather accurate over the past few years. He tweeted: "You didn't count Sundar out, did you? You would be very wrong to do so. He's got a rematch next week", and that's Sundar Pichai, the CEO of Google. One thing we've seen with past model releases is that every single time Google has demoed something absolutely incredible, it has been overshadowed by OpenAI's releases. For example, remember Project Astra, Google's amazing demo of what was essentially an AI agent quite similar to GPT-4o's voice mode, just with a bit more delay than OpenAI's version; it was completely overshadowed by OpenAI's product release just days prior. So what many people are now speculating, myself included, is that on Tuesday we might get some kind of release from OpenAI to overshadow whatever Google releases on the Thursday. I think this is quite likely, considering (a) Jimmy Apples has a good track record of calling AI releases accurately, and (b) a variety of Google models have been released in experimental form, which I've personally enjoyed using every single day. So a Google release would not be a surprise, and considering OpenAI's past track record, it wouldn't be a surprise if they release something too. Now, the question is: will it be GPT-4o large? Because the name "GPT-4o large" suggests that GPT-4o isn't a large model itself, which would mean that the current iteration of models we have access to is smaller than we think.

### [7:41](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=461s) GPT-4o model sizes

And this is where we get into some more information. Right here you can see that this user, I'm just going to call it the Strawberry account, talks about how people are going to lose it when they realize GPT-4o mini is the 8-billion-parameter model, GPT-4o is the 70-billion-parameter model, and they're sitting on the 405-billion-parameter model, because their highly quantized 4o is killing the competition, with more still cooking. I think GPT-4o mini being an 8-billion-parameter model is plausible, considering that the model isn't as good but does well enough for various use cases, and that it is extraordinarily cheap. One thing we've seen time and again is that OpenAI has been a decent level above the standard level of research and the standard level of open models out there. So if we take that into account, where the 8-billion-parameter open model is Llama 3.1 8B, and we assume OpenAI is, say, 30 to 40% ahead of standard closed models and even open ones, it's possible that GPT-4o mini is an 8-billion-parameter model. That's definitely plausible, considering that the version we're currently using is just the text-based one, that they may have made certain breakthroughs, and considering the Phi papers, a series of papers from Phi-1 all the way up to Phi-3 that look at how models can get better with certain techniques and how smaller models can be remarkably effective despite their size. So I don't think this is impossible. GPT-4o being a 70-billion-parameter model, though, would be a real shock, because while Llama 3 70B is a remarkably good model, that would mean OpenAI has made some remarkable breakthroughs, and things are about to get even crazier. It's not out of the question, but it would be a remarkable feat for OpenAI to accomplish.
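
As a rough, back-of-the-envelope illustration of why heavy quantization would matter for serving models of these sizes, here is a sketch of weight-memory footprints at different precisions. The parameter counts are the tweet's unverified claims, not confirmed figures:

```python
# Approximate weight-memory footprint of the rumored model sizes at
# different quantization levels. Real serving also needs KV cache,
# activations, and runtime overhead, which this sketch ignores.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

rumored_sizes = {
    "GPT-4o mini (rumored)": 8e9,
    "GPT-4o (rumored)": 70e9,
    "unreleased large (rumored)": 405e9,
}

for name, params in rumored_sizes.items():
    row = ", ".join(
        f"{fmt}: {params * b / 1e9:,.0f} GB" for fmt, b in BYTES_PER_PARAM.items()
    )
    print(f"{name:30s} {row}")

# A 405B model drops from ~810 GB of weights at fp16 to ~203 GB at int4,
# which is why aggressive quantization can make a huge model servable.
```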

### [9:58](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=598s) Capabilities reminder

And he does say they're sitting on the large model because their highly quantized 4o is killing the competition, and this does make sense in a way, because people keep forgetting something about GPT-4o, so I'm going to bring it up again: we don't actually have access to the full GPT-4o model. I'm going to show you this web page, because it's something I keep forgetting as well, and it's why I say people need to remember that OpenAI is still in the lead; OpenAI has spoken about iterative deployment. If you're thinking "why are you talking about GPT-4o large? this is a video about Strawberry", just trust me, this all links together in a moment. The "o" in GPT-4o actually stands for "omni", and one of the things people keep forgetting is how much you can do with a multimodal model. Look at all of the crazy things here: you can take a logo and paste it onto an image; if you open the full samples, you can see it can play audio; there was character design, where you can design a character and then have that character doing many different things, with consistent characters across generations. A model that can do all of this, and remember, we don't have access to that model, maybe this is what OpenAI is going to release this Thursday, considering the Strawberry account is claiming we just have the 70-billion-parameter model. We don't have the audio generation, we don't have the 3D stuff, we don't have this kind of image generation, and if we look at some of the other examples, like poster design, where you can combine two images into an instant poster, these are still tools we don't have access to. So I think this is some of the stuff people keep forgetting is part of the GPT-4o series.

### [11:49](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=709s) Comparison with state-of-the-art models

So look at that exploration of capabilities; I'll leave a link, because I made a 30-minute video going through all of this. GPT-4o is actually a bigger model than we think, so this tweet has a bit more credibility, considering that GPT-4o is a completely omni, natively multimodal model. Now, if we look once again at 400-billion-parameter models, we know that 400 billion parameters is achievable at the current state of the art, because of the recent Llama 3.1 405B. I'm looking at Scale's benchmarks here; the reason I'm not looking at the Arena benchmarks is that, I wouldn't say they're contaminated, and I'm not saying shady stuff is going on, but just with how certain battles have gone, certain models sit higher in the rankings than they should, which doesn't give an accurate representation of how models will fare in user evaluation. So for now I'm using this one because it's completely independent, and we can see that Llama 3.1 is right up there with the state of the art, with only a marginal difference in abilities versus other models. So we're getting to the stage where a 400-billion-parameter model being fully state of the art isn't out of the question at all, considering Llama's recent release. Now, when we get onto the actual Strawberry stuff, this is where we get into some of what I was saying before.

### [13:18](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=798s) "Strawberry" and OpenAI's tiny model claims

He says: "AI Explained has been close to this for a while, so I'd watch them for a cleaner take if you want to dig in, but this is what Ilya saw. It's what has broken math benchmarks. It's more akin to reinforcement learning with human feedback than throwing compute at the problem. sus-column-r is a very tiny OpenAI model using Strawberry, and Strawberry in larger models comes on Thursday", which is why I was talking about all the previous iterations of models and the kinds of things we're looking at now. It also says: "Think of it as an LLM fine-tuned to reason like a human, hence why Sam liked the level-two comment and felt great about it, and Ilya did not. Here we are." Now, the reason I find this so fascinating is that if sus-column-r, the model currently on the Chatbot Arena, where you can test out models and see how the next frontier of models performs, is actually an 8-billion- or 7-billion-parameter model, something tiny, and it's performing at the state of the art using whatever reasoning engine it's using, that is a true breakthrough, because it's able to reason through problems that some of the other models didn't get right. And like I said, if we're actually testing these things on real reasoning, multi-step problems, spatial awareness, linguistic benchmarks, actual reasoning rather than just "did you get the trick question right", that is the kind of thing that, with added scale, would show us this is close to an AGI-level system; maybe they just haven't scaled it yet. So the claim that sus-column-r is a very, very tiny OpenAI model using Strawberry is not a bad claim. If the model had gotten every single question right, I'd call it insane outright; when I tested the model myself, it was roughly on par with the rest of the state of the art, not ahead, showing flashes of genius on some questions but not pulling far ahead. But if it is a tiny model that's just as good as these other models, that's completely insane. And like I said before, the way the model responded to me, it just wanted to reason about the problems; internally it seemed to break each problem down in a particular way, and it just didn't feel like other models. It felt like some kind of reasoning engine. That was genuinely my experience with the model when I tested it over a bunch of different questions.

### [15:46](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=946s) Ilya Sutskever

Now, this is why, referring back to "hence why Sam liked the level-two comment and felt great about it, and Ilya did not": if this is all true, we have a model that is able to reason at a level as good as some of today's state-of-the-art models, and it's really tiny. Maybe this is why ASI is within reach, because all they might need to do is scale this model. With that being said, remember that it was only recently that Ilya Sutskever left and said that superintelligence is within reach. I think that is a really big statement; he went off to build not AGI but superintelligence, which means that, considering what he's thinking about, he's probably reasoning: if OpenAI can get to AGI from here, then using some of their methods, or some of the methods I've worked on, we can definitely get to superintelligence. And Ilya Sutskever is not a hype person; he's not about building hype. If he's saying superintelligence is within reach, it means whatever he has seen or worked on shows him that superintelligence is not far off, which is an insane statement to make. A statement like that, from one of the linchpins of OpenAI's success, means we're clearly not far off from AGI if superintelligence is within reach. Now, of course, there was also that level-two tweet, which I've already referred to: level two is basically human-level reasoning, which some people would say is AGI on the AGI scale.

### [17:20](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=1040s) AGI levels

When you actually think about what AGI is, most people would say it's human-level capability. And if we look at the actual levels, from 1 to 5, with level five being a system that can do the work of an entire organization, I don't think level five would be AGI. When you actually think about it, level five would be borderline ASI, because if a system can do the entire work of an entire company, that's not something one human could do, I mean, I guess in principle you could, but realistically it leans towards ASI if it can do that effectively and run non-stop. Now, there is also this tweet about huge models, Sora, voice, video, and safety, stating: "I've referenced some model sizing based on Meta and Claude having small, 8 billion, 70 billion, and large. This is a simple way to frame it and means nothing, except that a much larger version of 4 is coming, and when you try it, there will be a noticeable jump like the one we saw going from 3 to 4, and it arrives next week with Strawberry." Like I said before, I don't know if this is going to happen, but I do think that if Google manages to surpass OpenAI in public perception, OpenAI will likely release something. Now, one thing I did want to touch on is safety. This "GPT Next", which we all know about, they state is ready to go, which does make sense based on the timelines, and there are a variety of things going on in the background, but it says it's difficult to say whether competition will trump safety.

### [18:38](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=1118s) Safety concerns

Now, it says the red teaming is finished and post-training is done: "The model has such enormous sleeping capabilities that it's becoming impossible to make the model safe. If you had this particular model unlocked, you could easily disrupt the world on an unprecedented scale. When you mix in voice, video, Sora, agents, and the eye-watering capabilities, things heat up. They'll get the safety right and they'll roll it out, I'm sure." Now, I don't know whether the world would be disrupted at that scale if this model, GPT-5, is released; maybe with GPT-6. But one thing we need to remember is OpenAI's preparedness framework. The preparedness framework scorecard, this is from their recent GPT-4o model card, is where they evaluate the model: how the red teaming went, how everything worked; what we have here is their scorecard. This is why I always say that I don't know if we're going to get systems like GPT-7 or GPT-8, and people always ask: what do you mean we're not going to get those systems?

### [19:30](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=1170s) OpenAI's preparedness framework

Think about it like this: if you have a model that is sufficiently smart, really persuasive, autonomous, able to do a lot of things, we have to presume that every single model is going to be jailbroken. I don't think I've ever seen a model released that hasn't been jailbroken at some point. And the problem is that when you release a model that is actually useful, there's a fine line between a model being useless and a model being jailbreak-proof: often, the more useless a model is, the less likely it is to be jailbroken, and OpenAI's models are products; they want them to be as useful as possible. I've seen plenty of jailbreaks that are pretty hard to defend against, and new jailbreaks will always appear. The point is, OpenAI have literally said themselves that only models with a post-mitigation score of "medium" or below can be deployed, and only models with a post-mitigation score of "high" or below can be developed further. They're basically stating that they will only release a model if it stays at medium; if a model reaches "high" in model autonomy, persuasion, biological threats, or cybersecurity, they won't release that model at all. Which means that even if we do get future models, we'll probably get models that are somewhat dumbed down, because OpenAI have said they won't release such models under their preparedness framework, and this is purely a safety matter. I also think that as models get smarter and more capable, it's going to be harder and harder to tame them, because they're that much smarter in their reasoning, and that opens up the possibility of widespread adversarial use. So this is an issue OpenAI has been facing regardless of what that Twitter account says. Overall, there are a few takeaways. Number one: Google is very likely to release something next week, considering they have been testing frontier models on the Chatbot Arena for, I think, around three weeks now.
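
To make the deployment rule described above concrete, here is a minimal sketch of the gating logic. The thresholds follow the rule OpenAI states in its preparedness framework, but the code itself is my hypothetical illustration, not anything OpenAI has published:

```python
# Hypothetical sketch of OpenAI's stated preparedness-framework rule:
# deploy only if every post-mitigation risk score is "medium" or below,
# develop further only if every score is "high" or below.
RISK_LEVELS = ["low", "medium", "high", "critical"]  # ordered by severity

def worst_score(scores: dict) -> str:
    """Most severe post-mitigation score across all tracked categories."""
    return max(scores.values(), key=RISK_LEVELS.index)

def can_deploy(scores: dict) -> bool:
    return RISK_LEVELS.index(worst_score(scores)) <= RISK_LEVELS.index("medium")

def can_develop_further(scores: dict) -> bool:
    return RISK_LEVELS.index(worst_score(scores)) <= RISK_LEVELS.index("high")

# Invented example scorecard with one "high" category.
scorecard = {
    "cybersecurity": "low",
    "biological_threats": "medium",
    "persuasion": "high",
    "model_autonomy": "medium",
}
print(can_deploy(scorecard))           # False: "persuasion" exceeds medium
print(can_develop_further(scorecard))  # True: nothing exceeds high
```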

### [21:28](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=1288s) Summary and future predictions

This is exactly what OpenAI did before they released GPT-4o. I do think the Strawberry model is likely a very small model that works like a reasoning engine, because this is something we've seen from Reuters, OpenAI have confirmed that Q* was something, and Sam Altman has in fact replied to this account and tweeted about this several times. I do believe OpenAI will release something, depending on what Google releases. And when we look at the safety issues surrounding these next frontier models, it is likely to be hard for OpenAI to release them, considering their capabilities. I've said before that one way they could potentially release these models is to require a license to operate them, the same way you need a license to work with certain chemicals. That way, if things go wrong, they're able to independently track what the model was used for and what chats were sent, basically a way to track the models while still using them in industries where they won't be used for dangerous things, just a level of accountability. They could say: we've released this extremely capable model, but only to people who hold these licenses, and we're going to track everything that's said, with restrictions of course; you don't just release it completely to the general public, because individuals can do pretty much whatever they want without repercussions. So I think that's probably what will happen in the future, but I guess we're going to have to see. With that being said, I think next week is probably going to be an insane week for AI, as long as Google manages to release something, which it seems they are gearing up to do. I am eager to see what happens next week.
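
Since the licensing scheme above is only proposed as an idea in the video, here is a purely speculative sketch of what license-gated access with audit logging could look like. Every name and field is invented for illustration; none of this reflects a real OpenAI API:

```python
# Purely speculative sketch of the license-gated access idea from the video:
# requests must carry a valid operator license, and every exchange is written
# to an audit log for accountability. All names here are invented.
import datetime
import json

LICENSED_OPERATORS = {"LIC-00417": "Example Pharma Labs"}  # hypothetical registry
AUDIT_LOG = []  # in practice: durable, append-only storage

def gated_completion(license_id: str, prompt: str) -> str:
    if license_id not in LICENSED_OPERATORS:
        raise PermissionError("no valid operator license on file")
    response = f"[model response to {prompt!r}]"  # placeholder for a real model call
    AUDIT_LOG.append(json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "license": license_id,
        "prompt": prompt,
        "response": response,
    }))
    return response

print(gated_completion("LIC-00417", "summarize this assay protocol"))
```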

### [23:00](https://www.youtube.com/watch?v=MG4z1LSEvqw&t=1380s) Conclusion

If I did miss anything in the video, let me know. Don't forget to leave a like, don't forget to subscribe, and I'll see you guys in the next one.

---
*Source: https://ekstraktznaniy.ru/video/14137*