HUGE AI NEWS : Strawberry/QStar RELEASE DATE, Reflection 70B EXPOSED, New AI Agent SOTA

24:10

TheAIGRID · 11.09.2024 · 23,511 views · 638 likes

Video description

Prepare for AGI with me - https://www.skool.com/postagiprepardness 🐤 Follow me on Twitter https://twitter.com/TheAiGrid 🌐 Check out my website - https://theaigrid.com/ Links from today's video: Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed? (For business enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Contents (15 segments)

Intro

So with a crazy week in AI in terms of news, let's take a look at the stories that you might have missed. Now, one of the things that I think most people did miss, and even I missed this, was this crazy technology. So there's this new AI software from Hailuo AI, and

New AI Software

essentially this is absolutely incredible. You can see here that this is a new text-to-video software, and the prompt and the video that comes out of it are absolutely incredible. The prompt says: "An over-the-shoulder shot of a woman in close-up. At first she is laughing, then she becomes sad, then she starts to cry, and then she covers her face with her hand." Now, I think this example is absolutely insane, because it shows us exactly the kind of quality we're now getting from AI. I mean, that's just wild: she's laughing, then we see her emotion change, she actually looks really sad, and she puts her head in her hands. I'm not sure if they upgraded the model or what, but this model wasn't as good as it is now when I last tested it. So this is a text-to-video model, which is rather fascinating, because not only is this video doing the rounds on social media in terms of how realistic it is, we

Text to Video Model

also have this video on Reddit that's showcasing just how good this tool is. Now, I do want to state that I think the best thing this model does is that it generates clips at the correct rate. You might be thinking, what do you mean by the correct rate? Well, for some reason, when a lot of these text-to-video models output videos, they're usually in slow motion, and this is one of the first times I've seen an AI model actually use normal speed. It's kind of weird, but it adds to the level of realism that we see from these clips.

I do have to say, when I've been looking at these clips on social media, I truly don't think it's going to be that long before we get full-fledged AI movies and AI TV shows. And whilst that might not mean entire industries are going to collapse, I do think it's going to mean that if you're an aspiring creator or filmmaker, and maybe you've always wanted to make your own TV show but you've had a limited budget, something like this can help you flesh out an idea or even validate a proof of concept. So this is going to be really cool. I'm also thinking about the other end of this: for the many TV shows that ended on a strange season or something like that, people might even pick those shows up and continue them in their own way. It's going to be a really fascinating future, and I think this is one of the pieces of technology where we can really start to see that the future is going to be super uncanny and many industries are going to be ripe for disruption, which is fascinating to say the least. Now, there

Reflection 70B Model

was also this information regarding the Reflection 70B model. Now, if you aren't familiar with Reflection 70B, I do want to cover this quickly, because I did make a video about that model. Essentially, in that video they state that they fine-tuned a version of Llama 3.1 with 70 billion parameters to have chain of thought combined with reasoning steps, and that it basically smashes everything. Now, there's information on the internet suggesting that the Reflection API is just a Sonnet 3.5 wrapper with a simple prompt, and that they are currently disguising it by filtering out the string "Claude". So you can see here that when this individual says "write the word Claude, use plain text, no tags", the word "Claude" disappears from the output. There's been a lot of criticism surrounding this model, because many people haven't been able to replicate the benchmark results. Now, there was some recent information regarding this, and of course, like I said before, there have been many different tweets pretty much debunking Reflection 70B, and recently you can

Matt Shumer Apology

see Matt Shumer has come out here and said: "I got ahead of myself when I announced this project, and I am sorry. That was not my intention. I made a decision to ship this new approach based on the information that we had at the moment. I know many of you are excited about the potential for this and are now skeptical. Nobody is more excited about the potential for this approach than I am. For the moment, we have a team working tirelessly to understand what happened, and we'll determine how to proceed once we get to the bottom of it. Once we have all the facts, we will continue to be transparent with the community about what happened and the next steps."

I'm honestly not trying to jump to conclusions here, but there's a lot of speculation going on on the internet about whether Matt was just trying to hype up his investment in Glaive AI, because he built the entire thing and hosted it with Glaive. So some people were saying that he's got a personal investment in that area, so it was basically just good PR for him, and when you think about it, it did get a lot of attention. Currently it seems like the post is deleted, but it did have over 9 million views.

So for now I'm just going to await more details until we know the facts. But I've got to be honest, it doesn't look good if you come out with all of these benchmarks, then everyone tests the model and can't manage to repeat the results, and then it turns out it was undisclosed that you're an investor in that platform. It is a little bit shady. But I'm not completely writing off Matt Shumer. I'm just waiting for more information regarding the story, because they do say here that "at no point was I running any models from other providers as the API", and these guys are basically saying that there's a lot of work to do. So it will be interesting to see how the full situation pans out. But I will say that your reputation is something that you have to try to maintain, and if this turns out to be completely false, unfortunately that reputation is going to be quite damaged, considering that these were big claims and everyone was truly excited for this kind of announcement. Now, in probably the biggest

Strawberry Release Date

news of the video, and trust me, there's a lot, we actually have news that OpenAI's Strawberry is going to be coming within the next two weeks. This is rather surprising, considering that OpenAI has been delaying release after release, product after product, and pretty much keeping everything behind a waitlist besides GPT-4o, and even with that model people still don't have access to the advanced voice mode. So you can see here that it says Strawberry, OpenAI's reasoning-focused AI, is coming sooner than before: "OpenAI plans to release Strawberry as part of its ChatGPT service in the next two weeks, earlier than the original fall timeline we had recently reported, said two people who have tested the model. Release timelines are always subject to change, of course, but we have a few other new details about the product. We should explain that while Strawberry is a part of ChatGPT, it's a standalone offering."

So that is going to be rather interesting, because it seems like it's going to be a kind of separate service that you can use, and I'm wondering if it's going to be included in the subscription. You can see right here it says exactly how it will be offered is unclear. One option is for Strawberry to be included in the drop-down menu of AI models customers can pick from to power ChatGPT, and it's quite different from the regular service, with some advantages and shortcomings. Of course, what differentiates Strawberry from other conventional AI is its ability to "think" before responding, rather than immediately answering a query. The thinking stage usually lasts 10 to 20 seconds.

So I think this is going to be not a revolutionary change but a rather interesting element of the model, because we're going to get a model that's potentially much smarter, but people are going to have to wait a lot longer for their responses. Now, I do wonder if this is going to be capped in terms of output tokens or context length, like whether this model is going to be able to code certain things or predict certain things, because when you think about the kinds of reasoning we're going to be using a Strawberry kind of model for, it's going to be things that involve many different variables and many different scenarios. So I think the main thing I'm going to be watching with this model is that thinking time of 10 to 20 seconds, which, compared to how long current models take to respond, is quite the eternity in AI time. It's going to be rather fascinating to see if this is something that can truly have incredible intelligence compared to other models.

And it says there are other key differences. For one thing, the initial version will only be able to take in and produce text, not images, which means it isn't yet multimodal in the way that other OpenAI models are. As most large language models released today are multimodal, this seems to be a noticeable shortcoming, and the decision to release it text-only could reflect the pressure OpenAI is feeling to release products as it faces more competition. Now, I do think this is quite true, because allegedly there are going to be some other models released from other top frontier labs within the next 2 to 3 months as we close out the year. People are starting to realize that OpenAI is a behemoth, but it's a giant that can be beaten with many different methods and strategies, and we've all seen recently how Claude 3.5 Sonnet has taken the reins as the model that people use for their most intelligent queries.

You can see it also talks about pricing: "Strawberry is likely to be priced differently than OpenAI's chatbot, which has free and subscription pricing tiers. We don't know exactly how Strawberry will be priced, but it will likely have rate limits restricting users to some maximum number of messages per hour, with the potential for a higher-priced tier that's faster to respond, according to another person with knowledge of the product." So essentially it looks like, once again, like GPT-4 when it was released, we might be limited to just 25 messages every 3 hours, or every few hours or so, and we'd be extremely rate limited. Now, this is something that happens quite a lot, and I think these companies should actually start to have higher tiers for people who just want to use these models based on the costs proportional

Strawberry Pricing

to their chat. I know that there are so many people who, rather than having rate limits, would prefer to just purchase tokens and continue their conversation, so it will be interesting to see if they incorporate something like that. And I think one of the most important things we're going to see is of course the pricing. We do know that, yes, it's going to be a higher price, but what I'm looking at is how much higher it will be, because this is going to give us an insight into how expensive frontier intelligence is going to be in the future. If we see that this model is maybe only double or even triple the price and it's like 10x the performance, then we're going to see exactly how price scales with intelligence. The reason this is going to be so fascinating is that we know the price of intelligence has continually dropped, and I'm wondering if these companies are going to simply go the other way, considering that we're starting to get to the point where these costs are mounting up to quite a lot.

Now, we can also see here that it says "we would also expect paying ChatGPT customers to have access to Strawberry first, before it's released to the bigger group of free-tier users". I would expect the free-tier users to get this model probably 2 years later, like they did with the other models, but that isn't really the main focus. What we can also see is that Strawberry is expected to be easier to use than GPT-4o for complex or multi-step queries. Currently, customers have to type all kinds of additional words into ChatGPT to get the answer they want, such as telling the chatbot to walk through its intermediate reasoning steps to arrive at its final answer, otherwise known as chain-of-thought prompting, and Strawberry's capabilities are supposed to help customers avoid doing that or other hacks to achieve smarter results.

So basically what we have here is a model that is going to have an internal chain of thought, or internal reasoning steps, and I think this is rather useful, because many times what you have to do with these models is iterate again and again in order to get your final output. That is not only time-consuming, it's sometimes quite difficult, because there are steps that you miss and you might not get the best of the intelligence that is truly possible out of that specific model. So it's quite intriguing to see how OpenAI's internal workings are going to be, and that's something I'm going to be talking about in the next part of this video, because there was a research paper that shows some of these details.

It says that this means Strawberry will not only be better at math problems and coding, but also at more subjective business tasks like brainstorming product marketing strategies, and, "as we previously reported", this model will provide suggestions that are more specific to a user's company and more detailed, like generating a week-by-week execution plan. Now, what could be a major red flag and a warning for OpenAI is that some people who've used the Strawberry prototype have complained that its slightly better responses compared to OpenAI's currently released GPT-4o aren't worth the extra 10 to 20 seconds of waiting, the person said. So whilst right now it's going to be 10 to 20 seconds, over time that is likely to go down. But I am wondering whether those 10 to 20 seconds are going to remove the actual usefulness of this model, because you have to understand that the average user doesn't want to wait 10 to 20 seconds for anything to load. I do think that where OpenAI might have real success is in marketing the Strawberry product as some kind of, not god-tier intelligence, but higher-level intelligence, so that people are willing to wait for the responses because they need reliability and creativity, as well as the ability to solve complex math problems and, of course, to perform better on coding.

Now, I know most people are talking about reasoning and this and that, but the main thing I'm going to be looking at is whether this new Strawberry model can actually code better than Claude 3.5 Sonnet, as there isn't a single person I've spoken to who doesn't use Claude 3.5 Sonnet for their coding projects. What that means is that within the next 2 to 3 weeks, which is a very short time, we could be getting a new state-of-the-art model that is better than Claude 3.5 Sonnet in terms of coding. I'm truly starting to wonder what people are going to be able to build if this model is marginally better, or even largely better, than Claude 3.5 Sonnet, because there are so many people now building software and applications with these platforms that they didn't have access to before.
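The chain-of-thought prompting described in this segment, where the user types extra words asking the chatbot to walk through its intermediate reasoning, can be sketched in a few lines. This is only an illustration of the manual workaround, not anything from OpenAI: the helper names `plain_prompt` and `chain_of_thought_prompt` and the wording of `COT_SUFFIX` are assumptions made up for this example, and no model or API is called.

```python
# Hypothetical wording; any instruction asking for intermediate steps would do.
COT_SUFFIX = "Let's think step by step, then state the final answer on its own line."


def plain_prompt(question: str) -> str:
    """Return the question exactly as a user would normally type it."""
    return question.strip()


def chain_of_thought_prompt(question: str) -> str:
    """Append an explicit request for intermediate reasoning steps."""
    return f"{question.strip()}\n\n{COT_SUFFIX}"


if __name__ == "__main__":
    q = "A trip starts at 3:40 pm and takes 95 minutes. When does it end?"
    print(plain_prompt(q))
    print("---")
    print(chain_of_thought_prompt(q))
```

A model with internal reasoning steps would, according to the report quoted above, make the second helper unnecessary: the user would send the plain question and the model would do the step-by-step part on its own.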

OpenAI's Approach

So then we also had research published by Google DeepMind, and apparently this reveals the approach behind OpenAI's Strawberry. Basically, it searches at inference time, which allows it to reason better. It says test-time compute can be used to outperform a 14 times larger model. What they're talking about here is that if you give a model time to think, its ability to respond well improves on an insane level. So it does seem like this is the approach that OpenAI is currently using to get better reasoning from their models, and like I said before, it seems like we are now favoring reasoning over speed, because of course you would rather have a correct answer than an answer that is just extremely fast.

Now, for future models, it's likely that this trend will continue: the frontier of intelligence will likely always be slower than the level of intelligence that is available to individuals for free, because we've largely figured out how those models work, and there are open-source versions and much smaller distilled versions. So I think that if we're looking into the future, these models are always going to take a little bit longer to respond, because it seems like what they're going to be doing is searching over a large space of possibilities. So it seems like we're truly moving towards the stage where we're going to get models that are a lot more reliable and a lot more intelligent.
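To make "searching at inference time" concrete, here is a minimal toy sketch of one common form of test-time compute, best-of-N sampling with a verifier. It is not the method from the DeepMind paper or from OpenAI, and nothing here calls a real model: `propose_answer` is a hypothetical stand-in for sampling one answer, and `verifier_score` is an idealized oracle that knows the true value, which a real verifier would not. The only point is that spending more samples per question lowers the average error without changing the "model".

```python
import random


def propose_answer(rng: random.Random, true_value: float, noise: float) -> float:
    """Hypothetical stand-in for sampling one answer from a model."""
    return true_value + rng.gauss(0.0, noise)


def verifier_score(candidate: float, true_value: float) -> float:
    """Toy oracle verifier: higher is better. A real verifier is imperfect."""
    return -abs(candidate - true_value)


def best_of_n(rng: random.Random, true_value: float, noise: float, n: int) -> float:
    """Sample n candidates and keep the one the verifier scores highest."""
    candidates = [propose_answer(rng, true_value, noise) for _ in range(n)]
    return max(candidates, key=lambda c: verifier_score(c, true_value))


def mean_error(n: int, trials: int = 2000, seed: int = 0) -> float:
    """Average absolute error of best-of-n over many simulated questions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(best_of_n(rng, 10.0, 1.0, n) - 10.0)
    return total / trials


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"samples per question = {n:2d}, mean error = {mean_error(n):.3f}")
```

The trade-off matches the one discussed in the video: each extra sample costs more compute and more waiting time per answer, in exchange for a better final response.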

Jimmy's Predictions

Now, the legendary Jimmy Apples has made some recent predictions that give us an insight into the timeline of future model releases. Apparently we should have a 4.x model, maybe still called 4.5, which is of course the upgrade from GPT-4, in October, which would mean that we get this model next month. And of course the big boy, GPT-5, could come as early as December. Now, I do find this fascinating: if we get a model like GPT-4.5 just around the corner and then get GPT-5 in Q1, that would mark an insane 3 to 4 months, which is why you should probably be prepared for this time, because this prediction of GPT-5 in around Q1 to Q2 of 2025 is actually quite accurate. You might be thinking, what makes you say that? Well, a while back, and it feels like a really long time ago but it was probably only a year ago, when GPT-4 was released, I looked at the timelines and gave a prediction based on that, which I'm going to show you guys now. And I probably

Timeline

should have included Q1 or Q2 dates, but essentially this was the timeline. So of course GPT-5 did actually finish training in early 2024, so we know that GPT-5 is quite likely to arrive in early 2025, or possibly even in December. Now, I do think that GPT-5 might

Release Date Prediction

actually be released as early as December, but I think it hinges on whether or not the other labs can manage to produce a model that is somewhat better than the current models. Right now we still don't have access to Gemini 2 or Claude 3.5 Opus, and those are the two models that are expected to dethrone OpenAI. So my big prediction here is that we're going to get a 4.5 model, maybe it's Strawberry, whatever you call it, and that model is going to be leaps and bounds over what is currently available. Then, when the other frontier labs release their models, like Gemini 2 and Claude 3.5 Opus, OpenAI is probably looking to maintain its lead with a further release, GPT-5, rather than just offering a standalone version like the Strawberry model. So it seems that we're going to be getting three frontier model releases over the next 3 to 4 months. Of course I could be wrong, it's just pure speculation, but that seems rather likely considering that OpenAI doesn't currently hold the lead in terms of state-of-the-art models. Now, if you're

Google Building Out Compute

wondering what Google's doing, Sergey Brin from Google is basically stating that Google is building out as much compute infrastructure as quickly as it can, because there doesn't seem to be a limit on demand:

"For us, we're kind of building out compute as quickly as we can, and we just have a huge amount of demand. I mean, for example, our cloud customers just want a huge amount of TPUs, GPUs, you name it. We have to turn down customers because we just don't have the compute available, and we use it internally to train our own models, to serve our own models, and so forth. So I think there are very good reasons that companies are currently building out compute at a fast pace. I just don't know that I would look at the training trends and extrapolate three orders of magnitude ahead just blindly from where we are today. But the enterprise demand is out there. I mean, they want to do lots of other things, for example running inference on all these AI models, applying them to all these new applications. Yeah, there doesn't seem to be a limit right now."

And I think the fact that in this interview he's saying they literally have to turn down customers just goes to show how much demand there is for AI services and cloud computing. So a lot of these companies that have cloud computing services are going to be experiencing a really rapid rise in value, because if Google, which has a load of compute, is saying that it has to turn down customers, I can't imagine how some of these other companies are going to be picking up the slack. "Well, it's time to change." Now, I

It's Time to Change

think one of the largest stories that most people did miss is of course the Replit Agent, which is one of the biggest stories, and I think it's going to fundamentally change the online space in terms of who can build what, because

Replit Agent

basically what this allows you to do is start building instantly, rather than doing all the intermediate steps where you have to do at least a little bit of coding, then copy-pasting the code, then the deploying. This literally just does it from start to finish. So I think that this Replit Agent is truly, 100%, a complete game changer, and I think as these things get better, as the models increase their capabilities and the intermediate steps become more coherent and reliable, people are going to literally be able to build things with the click of a button. So here's a quick demo of the Replit Agent in action. You can see right here that it says "Absolutely, let me propose what we'll build", it goes through certain steps, and then literally in 2 minutes you have an entire website. So this is why I say certain industries are completely ripe for disruption, in terms of web design and in terms of certain coders. Of course you're going to need software developers to maintain this stuff, but I do believe that this industry is probably riper for disruption than any other industry currently, because it seems like every single month, or at least every two weeks, we're getting increases in terms of these models' capabilities. For

Honeycomb Agent

example, there was also "Ship fast with Honeycomb". So Honeycomb is building AI agents for software engineers, and I didn't even want to make a video about this, because I feel like I'm just saying the same thing again and again, but literally every month I see another state-of-the-art software agent that is just better than the previous one in terms of planning, in terms of coding, and in terms of being able to complete these software engineering tasks. And whilst some individuals may state that this is no better than a junior software engineer, you have to start to ask how that is not affecting the industry, if these entry-level jobs are getting taken by models that haven't even improved substantially. When I say that, I mean yes, models have gotten better over the last two years, but they have all sort of converged around GPT-4 level, and the thing is that we now have frameworks built around these models that continue to get better. So it's not like there was another huge training run and a newly deployed model that we were waiting on to see an increase in the quality of AI model responses. We're seeing an increase in these software AI agents just through sheer frameworks, which is remarkable, because it leads us to believe that once we get new models, the kinds of jumps we're going to see are going to be absolutely incredible.

So there's so much stuff going on. I think the next few months are probably going to be the most intense for AI, because not only are there going to be increases in capability, but I think it's going to set the tone in terms of expectations for what future models are going to hold. If you enjoyed the video, don't forget to leave a like, subscribe, all that good stuff, and I will see you in the next one.
