# AI News: So 2027 AI Is Going To Be HUGE, Sam Altman Reveals Key Milestones In AI, Google's New Model

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=tvMngWl6H1M
- **Date:** 30.06.2024
- **Duration:** 21:36
- **Views:** 21,524

## Description

Learn A.I With me - https://www.skool.com/postagiprepardness 
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Checkout My website - https://theaigrid.com/

00:12 SWE Bench
02:18 Gemma 2
07:43 OpenAI VoiceMode
09:53 2027 AI 
15:13 Sam Altman "All Of Physics"
17:59 China's New Humanoid
19:08 Finding Mistakes With GPT-4


Links From Today's Video:
https://x.com/tsarnick/status/1806071104148271434
https://x.com/tsarnick/status/1806159860368814504
https://x.com/sambhavgupta6/status/1806189387778232667
https://x.com/tsarnick/status/1806479900033085551
https://x.com/TheHumanoidHub/status/1806033905147077045
https://x.com/clmt/status/1806342399347597589

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Contents

### [0:12](https://www.youtube.com/watch?v=tvMngWl6H1M&t=12s) SWE Bench

AI labs that shape how we view what's to come. One of the first stories for this week in artificial intelligence, or I should say this day, given the rate at which updates are coming, is MentatBot, a GitHub-native state-of-the-art coding agent that scored 38% on SWE-bench Lite, where the previous state of the art was 33%. This is absolutely crazy because, I think, it's open source, and it managed to beat Devin, Alibaba, Factory, and IBM Research. They're using a new cognitive architecture that solves issues in a structured workflow: it first gathers the context, then it plans, then it destructures the plan into individual edits and applies them, and then of course it tests and reviews. If the plan is faulty, it goes back to the planning stage; if it needs more context, it gathers more context; and then finally it submits. I say this architecture is small, but it's actually really fascinating and smart, and you can see they've managed to beat the state of the art.

The reason I think this is so crazy is that it was only a couple of days ago that we got the previous result, and this isn't meant to steal MentatBot's limelight, but I remember making a video on Factory AI, where I was like, wow, these guys managed to get 31.67% on SWE-bench Lite using their new infrastructure, and already that's been surpassed once again, which goes to show just how fast AI development actually is. That team had months and months of trial and error, and what they shipped is a product you can actually use to effectively solve these issues, which is pretty insane. So I'm wondering how much more software-development progress we're going to get over the next few years, because it seems like literally every two weeks we're getting improvements, and even if each one is just 2%, if you compound that over the next couple of years, the rate of improvement is going to be absolutely incredible.
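The workflow described above (gather context, plan, destructure the plan into edits, apply, test, loop back on failure, submit) can be sketched as a simple control loop. This is a hypothetical illustration of that kind of cognitive architecture, not the agent's actual code; every name below is made up, and the callables are stand-ins for LLM-backed steps.

```python
# Hypothetical sketch of the gather -> plan -> edit -> test loop described
# above. None of these names come from the real agent; the callables are
# stand-ins for LLM-backed steps.

def run_agent(issue, gather_context, plan, apply_edits, test, max_iters=10):
    """Drive the loop until the tests pass or we give up."""
    context = gather_context(issue)
    for _ in range(max_iters):
        steps = plan(issue, context)        # destructure the plan into edits
        patch = apply_edits(steps)
        verdict = test(patch)               # "pass", "bad_plan", or "need_context"
        if verdict == "pass":
            return patch                    # final step: submit
        if verdict == "need_context":
            context = gather_context(issue, extra=True)
        # a faulty plan just falls through, and we re-plan next iteration
    return None                             # gave up after max_iters
```

The two failure edges in the diagram map to the two non-pass verdicts: a faulty plan loops straight back to planning, while missing context routes through context-gathering first.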

### [2:18](https://www.youtube.com/watch?v=tvMngWl6H1M&t=138s) Gemma 2

Even crazier: Google dropped Gemma 2, which is two models, one with 27 billion parameters and another with 9 billion parameters. It's incredible because Google's Gemma 2 actually beats Llama 3 70B, Qwen2 72B, and Command R in the Chatbot Arena. I say that, but I don't think you guys truly appreciate that a 27-billion-parameter model is beating Llama 3, Qwen2, and Command R in the Chatbot Arena. Now, whilst this isn't on the objective benchmarks such as MMLU, I still think it's rather impressive, since users may not realize what kind of model they're speaking to. I don't really like this image, because I have to turn my head to the left to read it, but it does show the rankings: here is the 27-billion-parameter model, Google's Gemma 2, and here is the 9-billion-parameter model, and it manages to surpass all of these others. Of course, like I said, that's just the subjective evaluations, not the hard benchmarks. They've actually got the hard benchmarks here too, and you can see that it manages to be on par with Llama 3; and in the 8-billion-versus-9-billion comparison it actually does a little better than Llama 3, which is really surprising, considering that when Llama 3 was released it was kind of the state-of-the-art model for its size, the best in terms of how condensed they managed to get the information. This actually did surprise me, because whilst we did get information on Gemma 2 beforehand, I didn't think it was going to be this small and this good. It goes to show that Google is actually shipping, and like I said in my previous video, Sundar Pichai, along with Demis Hassabis, has managed to, maybe not yet turn Google around, but at least get Google shipping models.

What's really cool is that this is open source: an open model with outsized performance, unmatched efficiency and cost savings, and blazing speed across anything you want to try it with. It's quick, easy to use, and has broad framework compatibility. So I think when we get to things like Gemma 3 and Gemma 4, we're going to see some remarkable improvements, and I'm really excited for that.

What's also cool from Google is that they've brought Gemini 1.5 Pro to a 2-million-token context window, and code execution capabilities and Gemma 2 are available today. You can see here: "At I/O we announced the longest-ever context window of 2 million tokens in Gemini 1.5 Pro behind a waitlist, but today we're opening up access to the 2-million-token context window on Gemini 1.5 Pro for all developers. As context windows grow, so does the potential for the input cost. To help developers reduce costs, we've launched context caching in the Gemini API for Gemini 1.5 Pro and 1.5 Flash." This is pretty insane, because we now have a 2-million-token context length, and what's crazier is that we haven't even seen other labs like OpenAI catch up. However, if you paid attention to GPT-4o's "secret" article, you'll know they do have such a model, and I think it's GPT-4o with around a 1-million-token context length. On that article, if you scroll down through the explorations of capabilities to lecture summarization, you can see that GPT-4o is able to take in a 45-minute video. I would presume that's around 1 million tokens, because it's pretty much the same length Google used for their demo video. So whilst OpenAI haven't shipped this model yet, it seems we're moving towards increased context length across all major models. It also seems Google might have realized what OpenAI had done; I'm not sure exactly how, but Demis Hassabis actually talked about Google making some kind of breakthrough in order to get to a 2-million context length, and that's what got them there before everyone else. So it seems like Google is really starting to ship, and I think we can probably expect the most from Google over the coming months. I would even think Google can potentially take the lead in the next year and a half, as long as they continue shipping and prioritizing their AI work, because they have an entire suite of AI products that, I personally believe, as long as they deploy them safely, could let them completely control the AI market.
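For a rough sense of scale, here is some back-of-envelope arithmetic on that 2-million-token window. The 1-million-tokens-per-45-minute-video figure is my own assumption inferred from the demo lengths, and 0.75 words per token is just a common rule of thumb for English text, so treat the outputs as ballpark numbers only.

```python
# Back-of-envelope context-window arithmetic. Both constants are
# assumptions, not official figures.
TOKENS_PER_45_MIN_VIDEO = 1_000_000   # assumed from the demo video lengths
WORDS_PER_TOKEN = 0.75                # rough rule of thumb for English

def video_minutes(context_tokens):
    """Minutes of video that would fit, at the assumed tokenization rate."""
    return context_tokens * 45 / TOKENS_PER_45_MIN_VIDEO

def text_words(context_tokens):
    """Approximate English words that fit in the window."""
    return context_tokens * WORDS_PER_TOKEN

print(video_minutes(2_000_000))  # ~90 minutes of video
print(text_words(2_000_000))     # ~1.5 million words of text
```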

### [7:43](https://www.youtube.com/watch?v=tvMngWl6H1M&t=463s) OpenAI VoiceMode

There was also something rather fascinating: a Reddit user, Rosie the Creator, got a sneak peek of ChatGPT's upgraded voice feature, which apparently sounds way better and actually generates background effects while narrating. This means, like I said before, that the model we're going to get is a truly capable one, which in turn means that in order for OpenAI to roll it out, they're going to have to do some serious infrastructure scaling, which is why we haven't got access to it yet. Here's the demo:

"I got you, Rosie. So picture this: there's this small town, right, real quiet, everybody-knows-everybody kind of vibe. Now, there's this old abandoned house at the end of the street that everyone says is haunted. Kids dare each other to go there, but most just run away before they even get close. One night a storm rolls in, thunder, lightning, the whole shebang. A couple of teens, trying to be brave or just plain stupid, decide to check out the house during the storm. They sneak in through a broken window and start exploring with nothing but a flashlight and their phones for light. As they creep through the dusty halls, they hear a noise from upstairs. It sounds like footsteps, but slow and dragging. They follow the sound, like I said, maybe a bit stupid, and it leads them to a room at the end of the hall. The door is slightly ajar, creaking as it sways. They push it open, and in the middle of the room there is this old, dusty rocking chair, just rocking back and forth on its own. No wind, no strings, just moving. The kids are freaked out, but also kind of mesmerized, you know?"

Now, I do think the model is trying to generate sound effects here, but one thing I remember from looking at the exploration of model capabilities is that the generated sound effects didn't sound that good. I'm not sure if OpenAI just struggles with that, but considering this model isn't just other models stitched together, it's natively multimodal, meaning it goes from one modality in to another modality out rather than combining a vision system with a separate speech system, I'm guessing that's why the sound-effects area isn't as good as the other areas.

### [9:53](https://www.youtube.com/watch?v=tvMngWl6H1M&t=593s) 2027 AI

Now, Anthropic's CEO Dario Amodei says that by 2027 — remember that key date I've talked about quite a bit in many of my recent videos; I'll be doing an entire video on it — AI models will cost up to a hundred billion dollars to train and will be better than most humans at most things. That means the next training runs they're looking at are going to cost $100 billion, which means they're going to have to receive a significant amount of outside investment, and that hundred billion is probably going straight to Nvidia's bottom line, which is interesting considering many people believe Nvidia won't be around for much longer as other players get into the chip space. But the point is that we have a situation on our hands: the next training runs after the $10 billion training runs are going to be $100 billion. Here's the quote:

"I've said this a few times, but back 10 years ago, when all of this was kind of science fiction, I used to talk about AGI a lot. I now have a different perspective, where I don't think of it as one point in time; I just think we're on this smooth exponential. The models are getting better and better over time. There's no one point where it's like, oh, the models weren't generally intelligent and now they are. I just think, like a human child learning and developing, they're getting better and better, smarter and smarter, more and more knowledgeable, and I don't think there will be any single point of note, but I think there's a phenomenon happening where over time these models are getting better than even the best humans. Right now a model costs what, $100 million; there are models in training today that are more like a billion. I think if we go to $10 or $100 billion, and I think that will happen in 2025, 2026, maybe 2027, and the algorithmic improvements continue apace, and the chip improvements continue, then I think there is, in my mind, a good chance that by that time we'll be able to get such models."

This is something Leopold Aschenbrenner actually talked about: if we take into account every single thing that contributes to an improvement in AI models — not just the orders of magnitude in scale, but the algorithmic improvements, the inference improvements, the chip improvements, plus the agentic workloads/frameworks we can use with these models — it will be no surprise when these models are truly capable in 2027, after the supposed $100 billion training runs. Now, I'm not sure who's going to fund that, or where you'd even get the money to train a model for $100 billion, because that amount of cash just isn't liquid at most companies unless you're Apple, but it's going to be pretty incredible, and I'm wondering how much we're going to get out of these next models that cost a billion dollars, or even $10 billion, which is already a staggering amount.

Amodei also said something about scientists accelerating discoveries in biology and curing diseases, and this is something we've already seen with the likes of Google's models, so I wouldn't be surprised if in the future we get specialized models that are focused purely on scientific breakthroughs. Quoting him again:

"Let's say those models get to the point where they're kind of graduate level or strong professional level — think of biology and drug discovery, think of a model that is as strong as a Nobel-prize-winning scientist, or the head of drug discovery at a major pharmaceutical company. If I look at all the things that have been invented — if I look back at biology: CRISPR, the ability to edit genes; CAR-T therapies, which have cured certain kinds of cancers — there are probably dozens of discoveries like that lying around, and if we had a million copies of an AI system that are as knowledgeable and as creative about the field as all those scientists who invented those things, then I think the rate of those discoveries could really proliferate, and some of our really longstanding diseases could be addressed or even cured."

I do think this is going to happen: AI actually accelerating the rate of discovery. Whilst this is quite futuristic and hard, if we look at what AI models are able to do, and at what they fundamentally are, I do think these discoveries are going to be made sometime in the future. Look at what we were able to do with AlphaFold, which accelerated the discovery of protein structures by, well, not a million years, but a huge amount of time. I think in the future we're going to get models that do that exact thing: they'll be able to experiment in a million different ways we couldn't even think of. So it's going to be truly fascinating as discoveries continue to happen, and I think we just need to get to the inflection point where we have a model that can continuously do that; then we put all our compute into it, and those discoveries will be made.
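Amodei's numbers above sketch roughly a tenfold jump per frontier-model generation, and a trivial calculation makes that trajectory explicit. The clean 10x-per-generation step is his framing of the trend, not a law, so this is purely illustrative.

```python
# Rough trajectory of frontier training-run costs, per the quote above:
# ~$100M for today's deployed models, ~$1B in training now, then ~10x
# per generation. Illustrative only.
cost = 100e6                       # ~$100 million
trajectory = [cost]
for _ in range(3):                 # three more generations
    cost *= 10
    trajectory.append(cost)
print([f"${c/1e9:g}B" for c in trajectory])  # ['$0.1B', '$1B', '$10B', '$100B']
```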

### [15:13](https://www.youtube.com/watch?v=tvMngWl6H1M&t=913s) Sam Altman "All Of Physics"

Even weirder, Sam Altman, in a recent live interview, actually spoke about this same phenomenon, although he got a lot of flack for it, and I'm going to explain why he shouldn't have. Here's the quote:

"I think, unlike other technology revolutions, we're aware — even if today we're like, okay, ChatGPT, I use it, I'm not scared of it — there is a sense of super-understandable anxiety about what it means if these tools keep getting more capable. There are tons of wonderful things, and we could talk about those all day, but there is this question of what the future is going to look like. Even if we solve every safety problem, misuse, figure out the perfect regulatory regime — what are our lives going to be like when it's not just that the computer understands us and does these things, but we can say, hey computer, discover all of physics, and it can go off and do that? What does it mean when we can say, hey, start and run a great company, and it can go off and do that? That's a big change."

I think people fundamentally misunderstood what Sam Altman was trying to say here. Some people just looked at this and said, oh my gosh, why would he claim an AI model can solve all of physics, that doesn't make any sense, and then clicked off. But I think what he's stating is that there is a lot about the universe we simply don't understand, and if we get to a future — bear in mind, a very far future — in which we can say, look, AI model, here is physics, whatever your understanding of it is, just focus on making breakthroughs on what we don't understand, then there are going to be quite a lot of paradigm shifts. There's this constant debate about whether we really know what we think we know, and our understanding is repeatedly changed by one brilliant mind, or just a few brilliant minds, over the course of a few decades, in ways that fundamentally reshape our world view. For example, we once thought the Earth was flat; we didn't know about the microorganisms that live on our skin and inside our bodies. There are all kinds of things like that still waiting to be discovered. Now, an AI system continually making game-changing discoveries in physics will be very hard to achieve, because AI is fundamentally software, so unless it has, say, a simulation that can one-to-one map reality — which probably could happen in the future — it's going to be extraordinarily difficult. But the fundamental point, that we're going to use AI systems to research and make breakthroughs, whilst it seems crazy now, well, just remember that a lot of what we see in today's day and age once seemed crazy too.

There was also a Chinese robot maker, Leju Robotics, launching a full-size humanoid robot, Kuafu, and it's integrated with

### [17:59](https://www.youtube.com/watch?v=tvMngWl6H1M&t=1079s) Chinas New Humanoid

Huawei's multimodal LLM Pangu, to understand natural commands, plan tasks, and execute with bimanual coordination. I would follow The Humanoid Hub, because it's a really effective resource for humanoid robots; it's something I stay up to date with when I'm looking for humanoid-robot news I otherwise wouldn't have found, and niche news like this, about these Chinese robots, is always interesting because it shows us what China is doing. If you're wondering why China is putting out so much in terms of humanoid robots: number one, China is very effective at building factories and producing things; their ability to execute when it comes to factories is just absolutely remarkable. Especially considering that China has run an initiative incentivizing companies to build a lot more humanoid robots, I think this is going to give us a real clue as to how the future will look, because we're likely going to see China expand their fleet first, and then of course the US is probably going to execute on another level as well.

We also got this from

### [19:08](https://www.youtube.com/watch?v=tvMngWl6H1M&t=1148s) Finding Mistakes With GPT-4

OpenAI, and I think a lot of people get mad at OpenAI when they tweet stuff like this, because OpenAI didn't used to tweet much, and when they did, it was usually a major update. So now, when people see tweets and updates and blog posts and it's not GPT-4.5, not GPT-5, a lot of people get, I guess you could say, a little mad. But they're working diligently to produce the GPT-4o voice model, and essentially they state here: finding GPT-4's mistakes with GPT-4. CriticGPT, a model based on GPT-4, writes critiques of ChatGPT's responses to help human trainers spot mistakes during reinforcement learning from human feedback. Basically, they trained a GPT-4-based model called CriticGPT to catch errors in ChatGPT's code output. They found that when people get help from CriticGPT to review ChatGPT code, they outperform those without help 60% of the time, and they're beginning the work to integrate CriticGPT-like models into their RLHF labeling pipeline. Basically, this is recursive self-improvement with humans in the loop, or I guess an extended form of it.

They do highlight that as they make advances in reasoning and model behavior, ChatGPT becomes more accurate and its mistakes become more subtle, which can make it hard for AI trainers to spot inaccuracies when they do occur, making the comparison task that powers RLHF much harder. This is a fundamental limitation of RLHF, because it becomes increasingly difficult to align models as they gradually become more knowledgeable than any person who could provide feedback. Basically, they're saying: if you have a dumb model, it's easy to provide human feedback, because you only need a human that's smarter than a dumb model; but as these models get increasingly smarter, you're going to need increasingly smarter humans, and there's an increasingly smaller pool of humans to provide that feedback. And at what point do you just start using the model to improve itself, if those subtle mistakes are things we can't even find? This brings up a bigger question: when are we going to get to recursive self-improvement with RLHF? Of course, I'm not stating that that's exactly how recursive self-improvement works; I'm just stating that I think we're eventually going to reach a point where we use AI models to evaluate AI models, and that shouldn't be a surprise, because we can see here that it's already being used and is already very effective.
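The critic-in-the-loop idea can be sketched in a few lines. This is a toy illustration of the general technique, not OpenAI's actual pipeline; the `critic` and `human_rater` here are made-up stand-in functions, where real versions would be a model call and a human trainer.

```python
# Toy critic-in-the-loop labeling step. Everything here is illustrative:
# in a real pipeline, `critic` would be an LLM call and `human_rater`
# a human trainer, not string checks.

def label_with_critic(prompt, response, critic, human_rater):
    """Attach a critic's notes to a response before the human judges it."""
    critique = critic(prompt, response)
    return human_rater(prompt, response, critique)

# Stand-in critic that flags a suspicious loop bound, and a rater that
# rejects any response with open critique notes.
critic = lambda p, r: ["loop bound skips last element"] if "range(n-1)" in r else []
rater = lambda p, r, notes: "reject" if notes else "accept"

print(label_with_critic("sum a list", "for i in range(n-1): total += a[i]",
                        critic, rater))  # prints: reject
```

The point the tweet makes is exactly this seam: as mistakes get subtler, the critic step raises the effective skill of the human behind `human_rater` rather than replacing them.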

---
*Source: https://ekstraktznaniy.ru/video/14213*