Massive AI News: Claude 4, Grok 3.5, OpenAI's New Model Revealed, And More...
43:32


TheAIGRID · 15.04.2025 · 40,857 views · 883 likes

Video description
Join my AI Academy - https://www.skool.com/postagiprepardness
🐤 Follow Me on Twitter - https://twitter.com/TheAiGrid
🌐 Check out my website - https://theaigrid.com/

0:00 Llama 4 chaos
02:45 Model switch controversy
04:00 Claude Max launched
05:00 Next-gen AI teased
06:00 Open-source surprise
07:10 Memory changes everything
09:30 Safety testing slashed
11:30 Dangerous capability delay
13:00 AGI getting closer
14:10 Dev productivity boost
15:40 DeepCode model drops
17:00 Browsing benchmark released
18:40 Google domination confirmed
20:10 Multimodal merger plan
21:30 Custom AI chips
23:10 Video editing magic
26:00 Microsoft's AI glow-up
30:00 AGI in 5
32:10 App dev simplified
36:00 Midjourney v7 arrives
37:10 Text failure backlash
39:00 AI scientist succeeds
40:00 Real robot demo
42:00 Robo boxing future

Links From Today's Video:
https://x.com/midjourney/status/1908063247812481049
https://x.com/kimmonismus/status/1910709172125200655/photo/1
https://x.com/slow_developer/status/1911092611751850330
https://x.com/slow_developer/status/1911037146317619532
https://x.com/TheHumanoidHub/status/1910782422532391395
https://x.com/ai_ctrl/status/1910651407922774292
https://x.com/sama/status/1910363428180611454
https://x.com/reidhoffman/status/1909997012327575886
https://x.com/OpenAI/status/1910393421652520967
https://openai.com/index/browsecomp/
https://x.com/hckinz/status/1910068784129355822
https://x.com/UnitreeRobotics/status/1910323916012466354
https://x.com/sundarpichai/status/1910019271180394954
https://x.com/ns123abc/status/1910220590159442096
https://x.com/clonerobotics/status/1910029748614414533
https://x.com/itsPaulAi/status/1910023406692753788
https://x.com/ai_for_success/status/1909949067871793453
https://x.com/testingcatalog/status/1909725934846116209/photo/1
https://x.com/togethercompute/status/1909697122372378908
https://x.com/SakanaAILabs/status/1909497165925536212
https://x.com/OpenRouterAI/status/1909047932835512414
https://x.com/vitrupo/status/1908763535351669017
https://x.com/vitrupo/status/1908889997916467493
https://x.com/slow_developer/status/1908256921611886847

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed?

(For Business Enquiries) contact@theaigrid.com

Music Used:
LEMMiNO - Cipher https://www.youtube.com/watch?v=b0q5PR1xpA0 CC BY-SA 4.0
LEMMiNO - Encounters https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Table of contents (24 segments)

Llama 4 chaos

So, this week of AI news is absolutely incredible because there has been a ton of drama. Starting off first with the Llama 4 catastrophe. And honestly, Meta, what happened at your offices? Because Llama 4 was the highly anticipated open-source model that we were all waiting on. However, it seems that something went wrong, something occurred during the development stages that led to a complete failure and breakdown of the model. And there was this thing that was leaked months ago by individuals that were in the AI space, individuals that were working at Meta. They already said that Meta's GenAI organization was in panic mode. And they said that this all started with DeepSeek V3, which left Llama 4 behind in the benchmarks. And adding insult to injury was the fact that this company was unknown and had a tiny training budget. And I remember at the time that this information was released, many individuals, including myself, were pretty much stating that no, no way, Meta are really, really on their game when it comes to AI. They're spending billions of dollars, they've got some of the best talent in the world, and they've been at this for quite some time. So, how did things really break down? Because this was one of the biggest stories of the week: a billion-dollar tech company failing to establish themselves with the next iteration of their AI model. That doesn't usually happen. I mean, something clearly went wrong there. And when we take a look, it seems that they released separate models, one for the benchmarks and one for public release. Ethan Mollick spoke about it. He said that the model that won on LM Arena was much different than the one released. He's been comparing the answers and they aren't even close. And the data is worth looking into, and it shows how the LM Arena results can be manipulated to be more pleasing to humans. This was something that really shocked me because I thought that, you know, maybe, just maybe, the West was far ahead of where China was. But this really shocked me because it means that if DeepSeek were able to ruffle the feathers of Meta and, you know, have them trying to game the benchmarks, potentially, allegedly (remember, I released a full video on that), you have to understand that is a really big deal. You can see here that the released version was not as great as the previous version. Now, the reason I'm including this in this AI news video is because there was an update. That update was absolutely, outstandingly, just so shocking. Okay, take a look at this. I found this post on Reddit about how individuals who previously worked at Meta are no longer trying to affiliate themselves with Llama 4. Take a look at the statement we

Model switch controversy

have here. Someone is of course now working at OpenAI and they used to work at Meta, and they state on their profile, I'm not sure what website this is, but it states, Llama 2 and Llama 3, period, I have not been involved with Llama 4 at all. That is absolutely outstanding. Individuals clearly do not want to be affiliated with Llama 4 by any means. And individuals stepped down. We saw some individuals, you know, leave that team. I mean, it wasn't looking good. I mean, hopefully Meta comes out and, you know, clears up the situation, releases the technical report. I mean, right now, it really isn't looking good for Meta, and I'm going to have some more information about that in tomorrow's video. Now, if we're talking about other AI companies, what are Anthropic doing? Well, they've actually released a new update that is the Max plan for Claude. It's flexible, with options for 5x to 20x more usage than their Pro plan. Now, this is priority access to the latest features and models. And I do like this because oftentimes we run out of Claude usage and we're just sitting there waiting for our usage to get refreshed. This is something that the AI industry has wanted for quite some time. Honestly, one of the reasons I probably don't use Claude as much as I should is

Claude Max launched

because I know 6 to 10 messages down the line, it's going to say, "Sorry, you can no longer talk with this model." And so now you don't really have that issue anymore. And I know that of course sometimes there are issues with regards to how expensive the models can get. So maybe the average person isn't going to use this, but for some of you who are truly deep in AI, I'm pretty sure you're going to find this valuable. Now, this wasn't the only thing that Anthropic spoke about. This week, they actually spoke about Claude 4 and how that's coming along. Take a look at what they said here when it comes to Claude 4. The Anthropic chief scientist Jared Kaplan says that Claude 4 will arrive in the next 6 months or so, that the AI cycles are compressing faster than the hardware cycle even as new chips arrive, and that post-training and reinforcement learning are accelerating progress that shows no signs of slowing. Take a listen, because this is not hype. This is real stuff. I think that the generation time for models has been really fast. At least to me it feels fast and I think

Next-gen AI teased

that's basically going to continue. So I think that we should expect a new generation of Claude models in not too long, certainly the next six months or so, and I think that basically that's going to continue, and it's both because we're improving sort of post-training or reinforcement learning training for Claude, and because I think we're able to improve the efficiency and intelligence from pre-training. So I think that's not slowing down anytime soon. I think in some ways the model cycle is even faster than the hardware cycle. We'll see if the hardware cycle is really one year, but it's definitely moving quickly and we're getting new chips sort of as we speak. Now, we also had Elon Musk, because Anthropic wasn't the only company speaking about their next iteration of frontier models. Elon Musk stated in a live stream when he was playing a game that they are going to be looking to release their next frontier models fairly soon. I honestly really do like Grok, so this isn't a surprise to me. I actually use that model fairly

Open-source surprise

frequently. I think it's definitely really underrated. So I wouldn't be surprised if their next iteration of models is outperforming the state of the art, because you have to remember they started way behind everyone and they've already caught up. So it wouldn't be surprising if they managed to leapfrog everyone in terms of AI capabilities. Grok 4, yeah, later this year. We've got Grok 3.5 coming out soonish. That'll be an upgrade, a lot of significant upgrades. Now, here's where we had Sam Altman finally, finally talk about open-source AI. This has been the very core message from OpenAI from the start. And recently, the giant elephant in the room at OpenAI was that they hadn't open-sourced anything. They strayed from their mission. And this was something that kind of left the community feeling salty. And of course, remember how I spoke about at the start how DeepSeek ruffled everyone's feathers in the AI industry? Well, I guess OpenAI didn't take that too lightly. They're actually going to be open-sourcing a model fairly soon. It probably could happen even today or this week, and I wouldn't be surprised if that does occur. OpenAI are taking their leadership very, very seriously.

Memory changes everything

Oftentimes we see them fall behind other companies when it comes to, you know, staying on top, but then every few months or so we'll see OpenAI release an update that will capture the mind share once again. So take a listen, because for those of you who were thinking that OpenAI are just closed AI, this should change your opinion. I think open source has an important place. We actually just last night hosted our first, like, community session to kind of decide the parameters of our open-source model and how we want to shape it. We're going to do a very powerful open-source model. I think this is important. We're going to do something near the frontier, I think better than any current open-source model out there. This will not be all, like, there will be people who use this in ways that some people in this room, maybe you or I, don't like. But there is going to be an important place for open-source models as part of the constellation here. And, you know, I think we were late to act on that, but we're going to do it really well. Now, another thing I forgot to mention is that memory is here and it changes everything. This is an episode that I recorded specifically for my Skool community where I share prompts, but trust me guys, ChatGPT's unlimited memory is absolutely insane. Two of the standout features were that it remembers everything and you can ask it anything. Now, I basically spoke about the fact that there are only two ways to use this model. Literally, you can either take advantage of past interactions or you can tailor future interactions. Let me show you guys two quick examples. Number one is to leverage your past interactions by asking a question. So, this is the question that I asked it, and no nonsense, like I'm not even, you know, being clickbaity or exaggerating. I asked it this question and I feel like I've had some sort of major breakthrough. It says, "Based on our previous conversations, what recurring pattern is preventing me from reaching a specific goal?" And for my specific goal, I said, you know, actually growing my business and doing better on YouTube. And it gave me a really interesting insight that I've actually started acting on. And honestly, the progress has been outstanding, and it's only been like 2-3 days already. And I had another area for really good prompts, and this is, you know, to leverage future interactions. One of the things I said was that, you know, as we continue working together, please challenge my assumptions when you notice inconsistencies with my stated goal of XYZ. So for you, it could be losing weight, running a marathon, or starting a business, whatever it is. And this one has just been absolutely incredible because it keeps me on track. Now, if you guys want all the prompts from that little slideshow and all the amazing life-changing prompts that I made, that's in the prompt section of my community. And don't forget to check out the academy, because leveraging AI in

Safety testing slashed

today's day and age is almost an imperative. Now, one thing that isn't looking good for OpenAI is that the safety testing time is being slashed. I read this article in the Financial Times and it actually speaks about how OpenAI has been slashing the AI model safety testing time. Testers have raised concerns that its technology is being rushed out without sufficient safeguards. Now, this has been a problem for quite some time. But essentially, we can see here that OpenAI has, once again, slashed the time and resources it spends on testing the safety of its powerful AI models, raising concerns that its technology is being rushed out without sufficient safeguards. Staff and third-party groups have recently been given just days to conduct evaluations, the term given to tests for assessing models' risks and performance, compared to several months previously. Now, I think this is a key indicator of how quickly things are moving. Previously, if you weren't familiar with the AI cycles, we'd have this huge one-to-five-year time period where you'd have to wait for the model to get trained, data collection, essentially this huge process to get the model out the door. And part of that time was the fact that they would have to safety test the model to ensure that it was safe. This safety testing oftentimes would be around, you know, six months or so. And of course, other companies, they would try and cut down the time so they could rush the models out the door. And it seems like OpenAI are iterating on a feedback loop that is so small that they only now have days to test things. Now, I don't know whether that is because they are now using other AI models to test and verify whether the models are safe, like they said they would, but I'm not sure if days are enough time to thoroughly vet an AI model to ensure that it is completely safe. Oftentimes many jailbreaks and things happen months after release. But considering that the AI space is moving so fast, is this going to be the new norm? Control AI is a Twitter account dedicated to AI safety that I follow, and it keeps me up to date with all of the AI safety news.

Dangerous capability delay

They spoke about, you know, the fact that one of the people testing o3 said that we had more thorough safety testing when the technology was less important. And why are OpenAI doing this? The tester says that despite the increased potential weaponization of the technology, there's more demand for it. They want it out faster, and it is reckless. This is a recipe for disaster. So, this is where the Financial Times identifies the competitive pressures as the driver of this behavior, as companies are incentivized to cut corners to gain an edge on their competitors. Like I said, remember DeepSeek and what they've done. They've truly ruffled the feathers, and companies are now grinding harder to get stuff out the door so that they don't fall behind. You can see that currently this time isn't enough to test. One person who tested GPT-4 told the Financial Times that some dangerous capabilities were only discovered two whole months into testing. That is absolutely crazy. So if you're given days to test a model, maybe the capabilities that are rather dangerous only appear months after they're out in public. And of course, right now these AI systems are really easily built. One of the key things that people are worried about is the fact that in the future it's quite likely these models will have a lot more intelligence and a lot more capability to do things that could amount to real harm. Now, I think it's 50/50, because if that is the case, OpenAI may just never release those models. But even if OpenAI don't, are other companies going to do

AGI getting closer

that? And are we going to see this area start to be super weird? I mean, it's going to be super interesting to see how regulations evolve as the AI safety area starts to expand. And if we're talking about AI advancing, this is where we have AGI being defined. OpenAI are getting really, really vocal about how they view AGI in the real world. The OpenAI CFO Sarah Friar notes that Sam Altman already believes that AGI might be here, but we aren't fully utilizing the potential yet. She speaks about this in some detail, so I would listen to this if I were you. AGI, in definition, is that point where we believe AI systems can take on, you know, a majority of the real kind of value-added human work in the world and do it. And we're getting pretty close to that being the case. If you ask Sam, he would kind of say it's, you know, it's imminent. We may be there. And it's also artificial general intelligence. It's not superintelligence, right? In fact, I know for sure we collectively as a world are not using it to its fullest extent. And so, you know, we would say we're getting pretty close at this point. Now, Sam Altman has actually been

Dev productivity boost

on a podcast recently. He's been on a world tour podcast, once again talking about AI. And in this recent clip, he actually speaks about the fact that it's quite likely that we will get AI systems that allow the average developer to be 10 times more productive. Right now, the conversation that everyone is having is, will there be an automated software engineer? And there are some leaks that do show that is coming, but Sam Altman doesn't really have that focus. He says their main focus is on making software developers much more efficient at what they already do, and he says that is probably likely to happen either this year or sometime next year, which genuinely wouldn't surprise me. I think it's the degree of automation that matters. To get to truly 100% automation, you know, you can make a complex thing and never touch code. That's one thing, but I'm less interested in that question than when a coder becomes 10 times more productive. And I think that could happen this year, next year. Now, if we're talking about models that are coming out of the door and how quickly things are moving, we have to talk about DeepCoder-14B. This is an o1- and o3-mini-level coding reasoning model that is fully open-sourced, and they are releasing everything: the dataset, the code, and the training recipe. This was built in collaboration with the Agentica team, and this is absolutely incredible. So it was released on the 13th of April

DeepCode model drops

(in fact, it was actually released on the 8th of April) and it was built on the DeepSeek R1 distilled model, specifically optimized for code generation and reasoning through distributed reinforcement learning. Now, this one stands out because it is a tiny model that can code really effectively. It scores about 60% on LiveCodeBench, 1,936 on Codeforces, 92% on HumanEval, and 73% on AIME 2024. So, this is noteworthy because this 14B model is comparable to proprietary models like OpenAI's o3-mini and o1 despite only having 14 billion parameters. Now, the exceptional performance comes from this model having, you know, really high-quality data. Apparently, what they did was take 24,000 unique coding problems from sources including TACO Verified, PrimeIntellect's SYNTHETIC-1, and LiveCodeBench, and they used GRPO+, which is a reinforcement learning method where the model was rewarded only when it passed all tests for a problem. So if the model passed just some tests or even made one mistake, it received no reward, forcing it to focus on complete solutions. Now, I think the, you know, clearest thing here is the fact that this is fully open-sourced, and this is of course going to allow people to build things that were otherwise impossible.
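To make that all-or-nothing reward idea concrete, here is a minimal sketch of a sparse, pass-all-tests reward function. This is my own illustration of the concept described above, not the actual DeepCoder or GRPO+ training code; the `run_test` helper and its signature are assumptions standing in for whatever sandboxed test harness a real pipeline would use.

```python
from typing import Callable, List

def sparse_reward(candidate_code: str,
                  test_cases: List[str],
                  run_test: Callable[[str, str], bool]) -> float:
    """All-or-nothing reward in the spirit of the GRPO+ setup described above:
    return 1.0 only if the candidate passes every test, otherwise 0.0.
    `run_test` is a hypothetical sandboxed test runner (assumption)."""
    for case in test_cases:
        try:
            if not run_test(candidate_code, case):
                return 0.0  # a single failing test means no reward at all
        except Exception:
            return 0.0      # crashes or timeouts also earn nothing
    return 1.0              # reward only complete, fully passing solutions
```

The point of withholding partial credit is that the policy can't farm reward by solving only the easy tests in a problem; it has to produce complete solutions, which is exactly the behavior the transcript describes.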

Browsing benchmark released

So whilst Meta struggles, other companies are building on top of other things, so it's going to be super interesting. Now, in terms of benchmarks, we actually had BrowseComp. This is a benchmark for browsing agents, and it's interesting to see why OpenAI created this. Now, agents that can gather knowledge by browsing the internet are becoming increasingly useful and important, and they state that a performant browsing agent should be able to locate information that is hard to find and which might require browsing tens or even hundreds of websites in the process. Existing benchmarks like SimpleQA, which measure models' ability to retrieve basic isolated facts, are already saturated by models with access to fast browsing tools such as GPT-4o with browsing, and to measure the ability of AI agents to locate hard-to-find, entangled information on the internet, we are open-sourcing a new benchmark of 1,266 challenging problems called BrowseComp, which stands for browsing competition. The benchmark is available in OpenAI's simple-evals GitHub repository, and you can read the research paper there. Basically guys, we are moving into a different paradigm. Browsing agents are completely different, and finding information is now quite easy. What you need to be able to do is locate information that is hard to find. You know how sometimes you're on the internet and you're trying to find that really niche piece of information, and you search on Google and you just know it's never going to return the response? That's the kind of benchmark that this is. And we can see here that this is something that frontier models aren't really great at. We can see that GPT-4o with browsing, GPT-4.5, and OpenAI o1 don't really perform that well, but Deep Research actually scores around 50%. But I do wonder how other models perform, because when talking about Deep Research, it's

Google domination confirmed

important to mention Google's Deep Research. And oh boy, have I been waiting for this, because Google have completely shocked me in terms of what they've done. So Google, if you haven't been paying attention, have truly taken over the AI industry. They currently have the best models, and right now we can see that the deep research area of AI is being dominated by Google on the benchmarks here, such as instruction following, comprehensiveness, completeness, and writing quality. Deep Research with Gemini 2.5 Pro Experimental is seemingly topping the charts. Now, this is not the only thing that Google was in the news for. In fact, they were in the news for a bunch of different AI announcements that are truly impressive, and it's time to take a look at just how great Google are doing. One of the things that they actually spoke about, and this isn't one of their announcements, but it's something that Google, I think, have been working on for quite some time, is that the DeepMind CEO Demis Hassabis has actually said that Google will eventually combine its Gemini and Veo AI models. So, essentially what they're trying to do, they're revealing their plans to combine Gemini and Veo to basically build a more powerful multimodal AI. Gemini already understands text, images, and audio, while Veo is the one that specializes in video generation. If you aren't familiar with just how good Veo is, trust me when I tell you that is a model that truly understands video creation. It's no surprise considering they own YouTube, which is the largest

Multimodal merger plan

source of videos in the world. Now, merging those two is probably going to push Gemini toward becoming a real-world digital assistant. And this is the broader trend that we saw with GPT-4o, the omni model. So, this is basically a model that can handle anything being input and can output anything. Now, Amazon is apparently working on an any-to-any system, and Google of course is trying to do the same thing. Now, it's quite likely that we could get this model soon, because if you haven't been paying attention, Google have some crazy AI chips. Recently, they announced their TPU progress, and we can see that they've increased their performance, measured in exaflops, 3,600 times since 2018. With the 2025 Ironwood, we can see that this is absolutely remarkable. And I'll even include a clip of just how crazy this is right now. ...heavily in this specialized hardware. And we continue to make massive improvements in performance and efficiency at scale. Today, I'm proud to announce our seventh-generation TPU, Ironwood, is coming later this year. Compared to our first publicly available TPU, Ironwood achieves 3,600 times better performance, an incredible increase. It's the most powerful chip we

Custom AI chips

have ever built and will enable the next frontier of AI models. Honestly, Google are surprising me here because they're showing us that they really, really have a hold on the AI industry. And the thing is that they're not relying on Nvidia chips to be able to do their AI inference. They're building all of this inference hardware for themselves, so they can not only train the best models, but also have these models thinking about difficult problems. So unless you paid attention to Google's event, you probably did miss it. And that's why I'll include a few clips now, because it was, you know, a B2B event. It wasn't really an event that was all about AI, but nonetheless, AI was there, of course, showing us exactly what it could do as it's being integrated into many different things. Given where we are, we're going to use the Las Vegas skyline as a perfect backdrop for what we're going to do with Vertex AI Media Studio. So, let's go ahead, and we're going to start by bringing in the Las Vegas skyline image. Really high-quality, beautiful image. We're going to generate video. But here's the new hotness. Check it out. Camera presets built right into Veo: panning left, panning right, timelapse, tracking shots, and even drone shots. So, let's go ahead and submit a drone shot: drone shot of the city skyline. There we go. We'll go and submit this. Now, normally this would take a few seconds. I ran this earlier today, so it's cached, so it's going to be a little

Video editing magic

quicker than normal. All right, let's look at video number one. Absolutely spectacular. We have the ability to see the fountains, the Eiffel Tower. Now, let's go ahead and take a look at video number two. A different angle that Veo creates for us. Again, stunning imagery. You can see the clouds in the background and look at the cars driving up and down Las Vegas Boulevard. Absolutely incredible. Now, one video is not going to do it for the concert promo we want to do. So, I want to show you some of the other videos that I created. I have one here of the stage being set up, all through the power of Veo. I have one of the band. I even have the audience actually clapping for what they're about to see. This will be a good reminder for all of you. Now, something very interesting happened. It turns out that Veo can do something that my 12-year-old can do, and that is be an expert in photobombing. It turns out that this great video we just saw has a crew member, and we love our crew members. However, in this case, I'd like to feature the guitar, because the guitar is the most important part of the band. So, let's go ahead and use Veo's new inpainting capability. And I'm sorry, sir. I apologize. I know you're very good at your job, but I am going to have to remove you from this image. We will send flowers to you and your family though, sir. Let's use the new inpainting capability, wait a couple of seconds, and let's see what we see. Now, if this does what I think it does, it should preserve every single aspect of what we saw before, just without our stagehand. Look at that. Okay, so we got some video clips. Now, we need some music. Let's try the first clip I created with Lyria and see how we like it. You know, that's not quite my tempo. I need music that's going to make all of you feel like I'm never going to give you up, let you down. I'm never going to run around and desert you. So, let's try clip number two and see how that works. All right, we have the recipe. I like

Microsoft’s AI glow-up

that tune better. We've got the videos. We've got the music. Let's pull it all together and see what it looks like. Here we go. Play it, Sam. Now, that wasn't the only AI company that had their announcements. Microsoft also had a major upgrade to Copilot. If you aren't familiar, Copilot is having major upgrades across the board to their system, and genuinely they are surprising me. Their tools are really easy to use, and the UI is something that is super, super simple. So, I can clearly see that they're marketing this towards the average person. And honestly, the way it is, it just seems a lot more intuitive than ChatGPT. Like, look at the user interface when doing the deep research. It's all very, very simple, and it's all intuitive in terms of the UI. And one thing that they did mention was that the UI is something that they're, you know, really bullish on in terms of developing over time, because it's probably going to be generative and specialized for every single person. So, I'm going to show you guys this clip as well, because it was super intriguing to see just how crazy Copilot is becoming in terms of actually being really useful and not just a GPT wrapper. So, this is how Copilot deep research really works. I go to Copilot, I select deep research, and I give it a topic. I'm planning a trip to Japan, and I'd love to learn more about the history and culture of matcha. Help me create a travel plan based on that. Cool. So, now Copilot will ask me some clarifying questions. We'll go back and forth and agree on a plan. Then it'll spend some time browsing and analyzing sources. I really love that I can see exactly how Copilot is working hard to research the topics that I'm curious about. I get a beautiful, data-rich report with graphics, tables, and insights that would have taken me days to gather and fact-check. It has links to all the references, so I know I can trust it. That was really helpful. I didn't even think to visit Uji and Kyoto, but now I will. Now, I really need to figure out this apartment situation. I've been putting it off for way too long, and it's become extremely stressful. This place is just way too far from work. Help me find an apartment close to the main Microsoft office in Redmond. Copilot does all the work I needed to do in the background, searching the web for apartments and finding me nearby storage. This is amazing. I can even have Copilot fill out that tedious form for me now. Perfect. Just got a note from Copilot that the apartment tour is booked. What else am I procrastinating on? Well, I do have to write that letter to my landlord. Another product feature we're about to roll out is called Copilot Pages. That can definitely help me here. With Pages, you can collaborate with Copilot in real time, like an actual back-and-forth thought partner. I'll start by asking Copilot a question. I live in the Seattle, Washington area, and I need to write a letter to my landlord to adjust the terms of the lease. What's the best way to do this to make sure it sounds right? Then I can easily move Copilot's response into a page. And from there, I can refine it with my own ideas and style. So now I can upload a file directly to Pages and combine the contents of my document with Copilot's response. And then I have the freedom to edit and organize however I want. Copilot Pages enables me to write and refine my thoughts in one dedicated place. That was easy and took no time. Like I was collaborating with a super smart friend. Now I need to sell some of my stuff.
I've already taken the photos of the things I want to sell, but I just need to edit them before I post them. What's really neat about Copilot Vision on Windows is that Copilot can understand beyond just what I say. With my permission, it can see my screen like

AGI in 5

a second set of eyes. It's my sounding board, and most importantly, it can respond in the context of what I'm seeing on my screen. Copilot, I need to edit this photo. I want to sell this chair and I need it to look good. How do I change the saturation? Don't worry, I got you, Dina. Just create a new adjustment layer down here. Oh, okay. I didn't know that was there. And you can change the saturation with this slider here. Thanks, Copilot. That looks really good. Letter to landlord, check; apartment tour booked; photos ready to post; and Japan, here I come. Thank you, Copilot. And a big thank you to everyone there today for joining us. Bye. Now, Mustafa Suleyman, the CEO of Microsoft AI, actually spoke in a recent interview where he said he can see a scenario where AGI is closer to 5 years away. He said, "Despite the rate of progress on AI, fundamental issues like hallucinations and instruction following and memory still need to be solved." So, I do think that this is an interesting clip that you need to pay attention to, because whilst yes, it's fun to ride the AI hype news, there's also a bit of realness and rawness that these CEOs can't hold back. Progress over the last three or four years has been electric. It's kind of unlike any other, you know, explosion of technology we've ever seen. The rate of progress is insane. Open source is on fire. They're doing incredible things, and every lab is, you know, every big company lab is investing everything that they've got in trying to make this possible. So, yeah, I could certainly see a scenario where it's closer to 5 years. I'm just saying, you know, instinctively to me it feels like there's still a lot of basics that we've got to get right. You know, we still have to nail hallucinations, those citations I mentioned. It's still not great at instruction following. It still doesn't quite do memory. It still doesn't personalize to every individual. But, you know, we're seeing the glimmers of it doing all of those things. So, I think that we're taking steady steps on the way there. Now, we also had Google release Firebase Studio, which is where you can build any app in natural

App dev simplified

language, modify it, and deploy it all in one place, basically a free alternative to Cursor, Bolt, or v0, directly in your browser. For those of you guys that have wanted to build AI apps before, this is going to be something that, you know, is remarkable in terms of its ability to do that on the fly. I know that oftentimes building becomes very, very difficult and there are many different problems that occur when you're trying to use many different apps, but Firebase Studio is something that is remarkably simple. The UI is designed in a really nice way, and I think even for beginners, you can use it to deploy simple web apps without any coding experience. I'll definitely be making a tutorial on this, because I really like making videos that break down topics into, you know, really easy to digest and understand concepts that everyone can use on a day-to-day basis. So, this is going to be something that I'm going to explore. So, take a look at what Google have said about their Firebase announcements, because, of course, I'd love for you guys to know. Really exciting things happened at Google Cloud Next 2025. First, meet Firebase Studio, our new cloud-based agentic development environment that gives you everything that you need to quickly create production-quality full-stack apps. We've brought IDX into the Firebase family and added a lot of new AI features. Now, you can prompt your way to a new app, or you can start with one of our more than 60 available templates. Plus, with Gemini's built-in help, accelerate your time from idea to deploying and running that app. But that's not all. We also announced that we're providing early access to Gemini Code Assist agents. Now, these agents can help with all sorts of tasks like migrations, code documentation, and testing. To get started, join the waitlist via the Google Developer Program. All right, let's talk Genkit. Genkit is designed to help streamline the process of building, testing, and monitoring your app's AI features and more. At Next, we announced initial support for Python and expanded support for Go. With Genkit, you can access Gemini models, Imagen 3, additional models through Vertex Model Garden, plus self-hosted models with Ollama. Speaking of model access, Vertex AI in Firebase provides a secure SDK to access models in Vertex AI from your client apps, including the latest models like the Gemini 2.0 multimodal Live API, which enables you to create more conversational interactions in your apps. Here are two more exciting launches. Firebase Data Connect and Firebase App Hosting are now generally available. Firebase Data Connect offers the robust reliability of Google Cloud SQL and type-safe SDKs in Firebase. And when you're ready to deploy that great full-stack app that you've been working on, look no further than Firebase App Hosting. We even added some new features, like being able to deploy with a single line. Yeah, we know, it's pretty cool. This is just a preview, so be sure to check out the latest blog from Google Cloud Next for even more announcements. Links to everything I mentioned are available in the description. We're so happy to bring developers the tools and platforms that empower you to build apps that your users deserve, all powered by Google Cloud. Now, another company managed to announce something that we've all been waiting for, I don't want to say years, but in the AI industry, waiting 8 months or something feels like years. This is Midjourney version 7.
This is where they're currently alpha testing their version 7 image model. It is the smartest, most beautiful, coherent model yet. Give it a shot and expect updates every week for the next two months. Now

Midjourney v7 arrives

it's quite surprising considering the fact that ChatGPT released image generation and people are still using Midjourney. I think it shows the importance of the fact that sometimes people use image generation services for one specific style. And oftentimes with Midjourney, that style is super-realism. Even though ChatGPT can do that pretty well, it pretty much is good at everything. The only thing that people seem to care about for Midjourney is really hyperrealistic photos, and mystical, you know, hyperrealistic photos too. That is the main theme of what I'm getting. Now, something was quite disappointing about Midjourney, but I don't think the community seemed to care. Take a look at this. Someone said that they tried the same prompt with Midjourney and with ChatGPT-4o. Midjourney's text generation is still a complete fail; for a team specializing in image creation that promised text generation with V7, this is a huge letdown. My excitement for their video generation is completely shattered. We can see here that Midjourney completely fails at text generation, and for ChatGPT-4o we can see that the text generation is absolutely impeccable. Now, this one is explainable by Midjourney themselves. They actually responded to this, and they

Text failure backlash

said that, you know, we prioritize features according to our community-wide ranked voting, which you can see here, and text rendering was rated as one of the lowest-value features. After V7 is done, they will do a new round of voting to update on what the community wants next. Now, I don't really see many people, you know, complaining about this. Like I said, one of the main reasons people use Midjourney is not for text. It's not for infographics. It's mainly for just futuristic, sci-fi, realistic pictures that you can use in other applications. And it seems like, yeah, people don't really mind. Honestly, I did think that they would solve text, but now that ChatGPT has sort of cornered that area, maybe they're just going to go even further down the rabbit hole of, you know, that style of images. So, it wouldn't be surprising if that was the case. But nonetheless, with V7, I mean, honestly, I do want to say I feel like we're reaching that point of saturation, because with images you can only go so far. So it will be interesting to see where Midjourney does go. Hopefully sometime in the future they may do video, but for now we really don't know. Now, we also got Dwarkesh Patel talking about UBI in this interview. It was fascinating. I know for some long-term fans of the channel, I used to speak about this quite a lot. So, it's pretty interesting to see how conversations are developing around the future as it ever inches closer. And another reason why UBI seems like a better approach than making some bespoke social program where you're, uh, making the same dialysis machine in the year 2050 even though you've got ASI or something. I am also worried about UBI from a different perspective. Like I think, again, in this world where everything goes perfectly and we have limitless prosperity, I think that just the default of limitless prosperity is that people do mindless consumerism. I think there's going to be some incredible video games after superintelligent AI.

AI scientist succeeds

And I think that there's going to need to be some way to push back against that. Again, we're classical liberals. My dream way of pushing back against that is kind of giving people the tools to push back against it themselves and seeing what they come up with. I mean, maybe some people will become like the Amish, try to only live with a certain subset of these super technologies. I do think that somebody who is less invested in that than I am could say, okay, fine, 1% of people are really agentic and try to do that. The other 99% do fall into mindless consumerist slop. What are we going to do as a society to prevent that? And there my answer is just, I don't know. Let's ask the superintelligent AI oracle. Maybe it has good ideas. Now, we also got an update to the AI Scientist, which produced the first fully AI-generated paper to pass peer review at a workshop level. This is really good, because I remember the last time the AI Scientist was released, a lot of feedback was that it just wasn't good. Oftentimes there's a

Real robot demo

lot of skepticalness, or skepticism I should say, that's the real word, when it comes to AI, because, you know, we're at the frontier of innovation when we're looking at these studies. And essentially this entire thing is an AI that can come up with new ideas, write them up as a paper, and essentially have LLMs peer-review that paper and judge it. So, it seems like this is a massive win for Sakana AI Labs, and I wouldn't be surprised if this is the future, where we have multiple AI systems testing ideas for, you know, ways to improve AI systems, and that's how the singularity succeeds. Now, one of my favorite videos this week is this 1X Neo robot. And I don't think you understand just how impressive this is. For you to have a robot that does a live demo shows that you have a level of confidence about your product that no other company has. Most times when we see, you know, robots dancing around and doing crazy things, they've recorded it maybe 50 times, 20 times, and it does it a few times, and the one time that it does get it right, that's the video they upload to social media. But here we can see the Neo Gamma robot doing these tasks autonomously in a live environment. Trust me guys, that shows us that the singularity is nearer than we thought. We could have tons of these robots doing tasks autonomously around our houses, being able to do many different tasks that, you know, humans just don't want to do. Of course, this might not be cheap, but I think this is going to radically change the economy when these are deployed in massive numbers. So, this was something that I think you guys should listen to, because of course, society is changing and this is at the forefront of it. What you see here now, of course, is just a subset of tasks that Neo can do. And this is a mix of autonomy for things the robot is good at, and some remote operation where someone's guiding the robot to basically do expert demonstrations on how to do these tasks. And as we have an increasing number of these robots throughout homes, living among us and learning, more and more of this becomes autonomous,

Robo boxing future

until hopefully one day all of this will be fully autonomous. Neo actually has tendons that get pulled, very loosely inspired by human muscle, and this makes Neo into a robot that is quiet, soft, compliant, lightweight, safe, and really able to live among us and learn among us. There's never really been a better time for robots. We have an aging population in need of help, and we have a large labor shortage. Now, to end this with some more light-hearted news, we got the Unitree robot once again doing something amazing, and this time it was boxing. I've seen this robot do a remarkable number of feats thanks to reinforcement learning. And crazily, they state that there might even be a robot boxing arena very, very soon. So, if I see that, I'll definitely be talking about it on the channel. And this was something that was super surprising, because I remember the exact day that I saw this robot released. It looked stiff. It looked stale. And I was like, okay, it looks pretty cool, but, um, where are we going to go with this? And, you know, not 6 months later, I'm seeing the robot literally spar with a human in some kind of Real Steel / cyberpunk vibe. So I mean, the future is, you know, just really, really incredible. And this robot is able to get up, is able to stand there, throw a jab, throw a hook. I mean, it's absolutely incredible what robots are going to be able to do in 10 years. I honestly cannot imagine what we are going to see.
