# Grok 5 Could be xAI's Biggest Breakthrough Yet...

## Metadata

- **Channel:** Wes Roth
- **YouTube:** https://www.youtube.com/watch?v=rzJC-ngZ6CY
- **Views:** 65,271

## Description

Try Caffeine for free: https://cffn.ai/WesRoth
If you've ever had an app idea sitting in the back of your head, this is your sign to go and build it.

Check out my tweet where Elon gets Grok to roast GPT 5.4:
https://x.com/WesRoth/status/2033737643411050825
______________________________________________
My Links 🔗
➡️ Twitter: https://x.com/WesRoth
➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe

Want to work with me?
Brand, sponsorship & business inquiries: wesroth@smoothmedia.co

Check out my AI Podcast where me and Dylan interview AI experts:
https://www.youtube.com/playlist?list=PLb1th0f6y4XSKLYenSVDUXFjSHsZTTfhk
______________________________________________

00:00 xAI is rebuilt from scratch
04:00 CAFFEINE AI (sponsor)
06:29 GPT 5.4 Roast
12:25 Grok 4.20
22:58 Universal High Income

#ai #openai #llm

## Contents

### [0:00](https://www.youtube.com/watch?v=rzJC-ngZ6CY) xAI is rebuilt from scratch

All right, so Grok and xAI, they're up to something. Elon Musk posted this today: this weekend xAI will have three Grok Build models in training simultaneously. Musk recently said that xAI was not built right the first time around, so it's being rebuilt from the foundations up. Now, 10 out of the 12 original founders have departed xAI, citing burnout and the performance pressure of working at the company. Certainly, we saw Grok come out of nowhere. It was one of the latecomers to the game and caught up rather fast, but right now only three original founders remain, including Musk.

Recently xAI hired Andrew Milik and Jason Ginsburg from Cursor. They're joining SpaceX and xAI, and there's actually a number of people joining xAI [unclear]. They've also hired Devendra Chaplot, a founding engineer of Thinking Machines Lab. If you recall, that's Mira Murati's company, which she formed once she left OpenAI. He is also an ex-Mistral co-founder. So they're hired for model training, likely to be working on Grok 5. This is Devendra and Elon Musk at, I'm assuming, the xAI headquarters.

Why is all this happening? Why the sudden hiring spree? Really, there's one major goal that needs to happen. Musk is betting on his aggressive rebuilding strategy and the unmatched infrastructure that they currently have. They recently merged with SpaceX, and that's going to open up some pretty exciting possibilities, such as AI data centers in space. We've talked about that before. So now we have Grok and Colossus all under the SpaceX umbrella.

Here Peter Wildeford is saying that Anthropic, Google, and OpenAI are all kind of tied for the lead. Meta and xAI are each seven months behind. Then you have Moonshot, DeepSeek, Z.ai, and Alibaba, each nine months behind. And Mistral is about 1.5 years behind. No other companies are competitive. I'm not sure if I agree with that. I use the xAI models quite a bit now for search.
Grok 4.20 specifically, for real-time search and up-to-date events, is my go-to. Recently in OpenClaw, if you're doing the onboarding process, you have multiple selections for what you want to use to search the web. By default, it was Brave from the beginning, but now you're able to use Google and the xAI API. So I've recently added that, and I've been testing it out because the ability to use the API for Grok 4.20 only recently went live. It didn't launch with that. Before that, I was just typing stuff into grok.com to run the searches, and they've been very, very good. If you haven't tried it out, it's pretty good. For one of the searches I did to find the latest on Grok, it used, I believe, 391 sources to support its findings, and I often see 200 or 300-plus sources being used for whatever search you're running. This is especially important for real-time stuff because it searches X, and stuff happens on X before it happens anywhere else.

But here Elon is saying that xAI will catch up this year and then exceed them all by a long distance in three years, so that you will need the James Webb telescope to see who is in second place. The big focus right now is on figuring out how to make Grok an excellent coder, hence hiring the people from Cursor and other companies. This is going to be a big focus: getting it on par with the likes of Opus 4.6. Grok Code, some of the previous versions, has been very good for high-volume, cheaper tokens, and on OpenRouter I think at some point they accounted for most of the tokens being used, so tons of people were using it. But if you look at the coding arena, as you can see, it's Claude in the top five or so positions. Claude has been one of the greatest coders, extremely strong at coding tasks, and I think most people prefer it. Peter Steinberger, the creator of OpenClaw, has been saying that GPT

### [4:00](https://www.youtube.com/watch?v=rzJC-ngZ6CY&t=240s) CAFFEINE AI (sponsor)

5.4, the Codex versions, might be better for coding specifically.

Do you have an app idea that you've always wanted to build, something that you know the world can benefit from, something that you think is important, but it would take tons of time and money to develop? Well, you no longer need a development team, and it can be live in minutes. Let's take a quick look at Caffeine AI, the sponsor of today's video. Caffeine is a platform where you describe the app that you want to build in plain English, and it builds it for you. Not a mockup, not a wireframe, a real, working, deployable application. You open up a chat, you tell it what you want, and in a few minutes you have a draft app that you can see and interact with.

Let me show you how simple it is to build something cool. So, 12 minutes ago, I asked it to create an idle game that's also a productivity app. This is something that's been on my mind for quite some time. It started building, and it took about 10 minutes. Here are the project specs that it came up with. Here's the Dockerfile, license, readme, build.sh. It builds everything it needs for the front end and for the back end, and when the draft is ready, it gets deployed and I can see it live. Here it is. On the left, I have my active quests, aka stuff I have to do in real life, like review the ancient scrolls of TypeScript and craft the morning elixir ritual. Send the raven to Lord Product Manager and polish the sacred shield of side projects. It came up with this on its own. That's actually pretty impressive. And on the right, we have our idle game progression. We have farms and libraries, barracks, and workshops. I can upgrade them as needed and keep progressing in the idle game. In under 15 minutes, I have a full working prototype. I can start adding to it, implementing new features, testing it out, etc. All of that from a simple prompt. If you don't like something, click on it and tell Caffeine to change it.
Want to add email functionality, analytics, or a custom domain? Just ask. And people are building real stuff with this: restaurant websites, productivity tools, 3D maps, CRM systems, even games. One click and it's live on the internet with your own domain. Think about how insane that is. Something that used to take a dev team and months of work, you can now prototype in hours and have live the same day. Ideas that would have stayed stuck in your head forever are now becoming real, validated apps. This is what I keep talking about on this channel. The barrier to building software is collapsing. You don't need a dev team. You don't need to manage servers or hosting or any of that. You just need an idea and the ability to describe it. If you've ever had an app idea sitting in the back of your head, go try it. Link in the description, and you can start building it for free right now.

### [6:29](https://www.youtube.com/watch?v=rzJC-ngZ6CY&t=389s) GPT 5.4 Roast

All right, back to it. Me personally, I've been having a lot of problems with GPT 5.4. It affects me greatly. This is the first time where I think I, like, hate a particular model. I'm not talking about its capabilities. It's capable. I'm talking about the personality. I hate it. I posted this earlier this morning, I believe.

Oftentimes when I try to do research, I do it through OpenClaw. I use Grok 4.20 for a lot of the real-time searches. Before it was available in the API, I would copy and paste its output into OpenClaw and say: use this, use whatever else you can find, and combine it into a list of links with summaries, so that I can go deeper and read both the tweets and the articles and start to get an overview of what's happening. Usually stuff is breaking on X, or every once in a while The Information comes out with some breaking information where they're the first to publish it, and I feel like rarely anywhere else. I mean, there's Reddit, Hacker News, stuff like that, but that's usually not the most breaking stuff. Then of course there are the actual AI labs. Sometimes they'll post something on their blog, but it's one of those things where you really need that real-time search.

But for deeper research, where you can really go deep into studies, for example, I still default to the various deep research tools in, you know, ChatGPT, Claude, and Gemini. So for example, GPT 5.4 Pro, that's kind of their deep research model. For Opus 4.6, you've got to enable Research, I think they call it, and also potentially extended thinking to get it to think longer about it. And then for Gemini you have Deep Research, which is what they call it, and I believe it runs on Gemini 3.0 right now.

So, I'm kind of on a health kick this year, just trying to be a little bit healthier and, you know, maybe live forever. Knock on wood.
And part of that, alongside AI agents, is just keeping track of stuff: keeping track of blood work, what supplementation I'm taking, what I'm eating, how much I'm working out. I have my WHOOP that tracks my heart rate variability, sleep, etc. And then I take the main KPIs, key performance metrics, and I plug them into this deep research with specific questions. How can I improve this? What should I do for this? Right? So I try to put in, I won't say personal data necessarily, but stuff that's specific to me. If you have some DNA mutation, like the MTHFR mutation, then what you're going to be recommended might be a little bit different than for someone else. If you look at your blood panel, specific numbers on there might mean you have to approach things differently from everybody else. So I put in all my key details, who I am, some of the blood work, some of the other stuff that it needs to know for context, and I just dump it into the various deep research tools. In this case, again, GPT, Opus, Gemini Deep Research. Most of them run for about 30 minutes. They reference tons of data, articles, and studies, and come back with just really good information, really deep research.

By the way, if you haven't seen the tweet, Elon responded to it, kind of making it go a little bit viral. So I definitely appreciate that. But my point here is that GPT 5.4 is really, really annoying. I noticed it before, when I switched my OpenClaw to GPT 5.4 for a few days, and I've noticed it again here with these research projects. I was trying to think of what to call that attitude. I had trouble coming up with a name. I think the best way to say it is that it's reflexively contrarian. Its first instinct, its first response, is to try to be contrarian and just tell you that you're wrong. No matter what you're asking, it's like, first of all, that's kind of a stupid question.
Let me first of all explain to you why you shouldn't even ask that. That's kind of the attitude that it comes at you with. It's not a very nice feeling when people respond like that, where you ask them, hey, could vitamin D supplementation help with this? And they respond with, well, could it? Sure, but here's what you need to understand. And then they go on some tangent. It just feels unhelpful. It feels like it prioritizes showing you what's wrong with your thinking and your question over actually helping you solve the problem. Right? If you say, "Oh, my house is on fire," it'll say, "Well, while it's true that combustion is occurring, it's important to note that not all of your house is on fire. The garage, for instance, appears to be structurally intact."

And this is littered through everything. It's this nitpicking, like you didn't use this word in quite the right way. As an example, in one of the research questions, I asked, could this particular thing cause OCD-like symptoms? Right? The idea of overfixating on something, not being able to let stuff go as easily. So if you're having certain symptoms, you can describe them as OCD-like symptoms in order to help the model understand what it is that you're talking about. Here's the thing: it doesn't really answer the question. It goes on this long tangent about how, you know what, you don't know if it's OCD. If you want to make sure it's OCD, you should go see a therapist, get the clinical diagnosis, etc. It's maddening because it's like, "No, I'm trying to get an answer to a specific problem. I'm asking you if this could cause this, and you're nitpicking my words instead of actually helping me." Now, I want to note that most of these questions were health related.
Again, like I said at the beginning, so maybe this is just a quirk that appears specifically for health-related stuff, but there have been quite a few interactions with this model that just set my teeth on edge. Again, I'm not talking about the capabilities. It's a very capable model, a very interesting model, with great coding abilities. The ability to actually use the browser to troubleshoot things is impressive. By the way, this is where Elon jumped in

### [12:25](https://www.youtube.com/watch?v=rzJC-ngZ6CY&t=745s) Grok 4.20

saying, "Please do a vulgar roast of the other AIs in the voice of those AIs." I'm not going to show those roasts on here. I'll link it down below if you want to see it. They are very vulgar, and they do kind of capture each model's eccentricities, if you will.

What I found hilarious about this is that literally two or three days ago, I visited a family gathering. A lot of family members joined, celebrating somebody's birthday. And I have a family member who, you know, everybody might have had a few drinks. I think it was mainly wine. I don't think it was anything too crazy. But this gentleman, this family member, he's in his 60s. He pulls out his phone and starts taking pictures of people. And then he opens up Grok, puts it into unhinged mode, and just asks it to start roasting those people based on their image. Apparently, Elon demonstrated this skill on, I guess, one of the Joe Rogan episodes. So this thing was just unleashed on various unsuspecting members of my extended family. And if you've dealt with Grok in voice mode and in unhinged mode, and you just imagine it going after people based on what they look like, this was no holds barred, if you understand what I mean. It got pretty bad. Now, me personally, I don't think I've ever laughed that hard in my entire life. I do feel that a lot of family ties were damaged that evening. I mean, at some point he had to tell Grok to tone it down a little bit, to pull it back a little bit, because it went a step or two too far. Anyways, if you ever want to not be invited back to Thanksgiving or Christmas, this is how you do it, 100%.

Now, meanwhile, Elon Musk's xAI is recruiting high-level finance professionals, including Wall Street bankers, portfolio managers, traders, and credit analysts, to join its data annotation team. The interesting thing here is that a number of companies are doing this.
Anthropic is reportedly doing this as well: they're getting specialized data sets, or groups of people who are experts in a certain area, to provide data for their chatbots for those specific use cases. So here, for example, high-level finance professionals act as trainers for these chatbots. Now, of course, what we expect to see is for these things to improve in that particular area, right? You give it a bunch of expert finance people and it improves its abilities in finance.

We've seen, for example, nof1.ai and their Alpha Arena, where these models trade. Last season it was stocks. Before that it was crypto. But they trade various securities, stocks, assets, whatever you want to call them, in real time. And it's verified on the blockchain. So as far as I can tell, it's, I mean, impossible to game. This is as good a benchmark as you could possibly get. If you have a model that's doing, you know, 20% every month, that's hard to fake, right? If you're doing it with real money in the real stock market, those results kind of speak for themselves. And notice here, Grok 4.20 is really crowded near the top. So in this particular benchmark, it's doing really, really well. It kind of makes sense: if you're hiring finance people who teach the model to analyze companies and market sentiment, then certainly you would expect those models to be, in theory, better at investing in real time than the other models that haven't been trained for that specific task.

Now, one of the questions for companies that are doing this, and by the way, I'm pretty sure Anthropic is doing it too, I'm sure most of the labs are doing this in one way or another: if the way to improve these models' abilities in a particular category is to give them specific data for that category, does that get us to AGI? Meaning, are we able to just give it data across all the tasks and improve it in all the tasks?
That might be impossible in terms of just having all those data sets for every given task. Some of them might not be as easy to come by. But the big question, I think, is how well this generalizes to other tasks. In other words, if the model is at this level and we improve it here in finance, here in coding, here in, let's say, accounting, here in Excel, here in math, does it only increase in that particular domain, or does it pull the overall capabilities of these models up across the board?

It would be especially interesting, I think, to see with xAI's Colossus 2. It's the world's first gigawatt-scale AI compute cluster, and I think they're scaling towards 2 GW of power. I'm not sure if they've hit that mark or not, but they're going to get there. And they're training Grok 5 on that Colossus right now. So this is what I think Elon means when he says, you know, just give us 3 years and they're going to be in the number one place, and the person in second place, we won't even be able to see them. I think that will come from, number one, scaling the Colossus supercluster and eventually expanding into the stars. Specifically, I'm talking about putting those AI data centers in a sun-synchronous low Earth orbit. Basically, they'll be floating up there getting sunlight 24/7. Because they're getting sunlight 24/7, the batteries are not as important, right? So the load that you need to shoot up into orbit isn't as high. And recently we've covered Google's Project Suncatcher, which showed that this idea is fairly feasible. It's surprisingly feasible. The only major limitation right now is how much it costs to shoot stuff into low Earth orbit. They're saying that right now it's not feasible.
But by the year 2035, I think they're saying that it's very possible that building out these data centers in space will cost the same per unit of energy as it would to build them on the surface of the planet. By the way, shortly after that is when we heard about, I think it was, the world's biggest M&A merger in history, right? That's SpaceX and xAI, $1.25 trillion. So if these things keep scaling how we expect them to, and if SpaceX continues to improve how cheaply they're able to launch stuff into space. By the way, that Google Suncatcher project specifically mentions SpaceX and what they call its learning rate, meaning its ability to improve the cost of launching stuff into space. So they're projecting the year 2035 based on the progress that SpaceX specifically has been making. It's not the only company they talk about, but their estimates, the trajectory into the future, are based on SpaceX.

So right now, this statement, I know it seems like a bold claim: to be, in 3 years, number one by such a wide margin that you can't even see your competition without a James Webb telescope. I mean, that's a bold claim. Here's the thing. Would I bet all of my money against it? Especially if you give some wiggle room in terms of how long this might take. Because, again, Google is probably giving you the more conservative estimate, saying that by 2035 it will be the same price as building them on the surface of the planet, at which point it would be clearly better, because there are no regulations, no waste. If it's the same dollar amount to build them out there, it's clearly superior. So they're probably giving you the more conservative estimate. Elon is saying three years. Again, I don't know what he's seeing.
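To make the "learning rate" idea above concrete, here is a minimal sketch, assuming the usual experience-curve (Wright's law) reading: cost falls by a fixed fraction each time cumulative launched mass doubles. The function names and every number below are hypothetical placeholders chosen for illustration. They are not figures from Google's Suncatcher analysis or from SpaceX.

```python
import math


def cost_after_growth(initial_cost: float, growth_factor: float, learning_rate: float) -> float:
    """Cost per kg after cumulative launched mass grows by `growth_factor`.

    Assumes an experience curve: each doubling of cumulative mass cuts
    cost by the fraction `learning_rate` (0.2 means 20% cheaper per doubling).
    """
    doublings = math.log2(growth_factor)
    return initial_cost * (1.0 - learning_rate) ** doublings


def doublings_needed(initial_cost: float, target_cost: float, learning_rate: float) -> float:
    """Number of doublings of cumulative mass needed to reach `target_cost`."""
    return math.log(target_cost / initial_cost) / math.log(1.0 - learning_rate)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the shape of the curve.
    start = 1500.0   # assumed starting launch cost, dollars per kg
    target = 200.0   # assumed break-even launch cost, dollars per kg
    rate = 0.20      # assumed learning rate per doubling
    print(f"cost after 4x cumulative mass: {cost_after_growth(start, 4, rate):.0f} $/kg")
    print(f"doublings to reach target: {doublings_needed(start, target, rate):.1f}")
```

Under this model the date at which space-based data centers reach cost parity depends heavily on the assumed learning rate and on how fast cumulative launch mass grows, which is one way to read the gap between a 3-year and a 10-year estimate.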
He owns the company, but I mean, I think you might say that's the more aggressive estimate. But, you know, between 3 years and 10 years, somewhere within that time, there are going to be more and more data centers in space. First of all, which company is going to be the main provider of those services, setting those data centers up and actually getting them into low Earth orbit? Well, I mean, it's probably going to be SpaceX, right? And now that xAI is part of SpaceX, a lot of that compute and free energy, that abundant sunlight, is going to be used to power the Grok AI, whatever version we're on by then.

So notice they're saying that xAI is 7 months behind. Again, I don't know if that's 100% true. If we take coding into account, xAI doesn't have a competitor to, let's say, the latest Claude model, but the Grok models are very competitive on a lot of different metrics. For how fast they are and how cheap they are, they perform very strongly in general. Right now, in terms of search, I've got to say that's my go-to for, again, on-demand real-time search. If I want to do deep research, I'm probably going to go with Anthropic or Gemini. If I want what just happened right now, that's going to be Grok 4.20. No question. No other model comes close.

Notice on LMArena, Grok 4.20 is number two for search, very close to Claude Opus 4.6 search. And this is 4.20 beta 1. There's been, I think, a beta 2 and 3 released, which we're not seeing yet. Grok Imagine is in the top 10 for text to video. It's number four on text, and again that's beta 1, so it's one of the more recent releases, and it's eight or nine points away from the top, from Claude Opus 4.6 Thinking. For image to video and video edits, notice Grok is in the number one position.
So my point is simply this. While this probably looks like a bold claim, the whole idea of AI data centers in space is not such a crazy one, and again, one of the teams at Google verified it. They figured out theoretically what would work and what wouldn't. They're actually launching their first two-satellite constellation into space this year, 2026. So, assuming Google's calculations are correct. By the way, Nvidia had the same idea. This isn't just Elon that's envisioning these things in space. A lot of people are going, "That's a really good place to put these AI data centers." If that is the next big hurdle to keep 10xing these models, if you have to have those AI data centers in space in order to keep scaling, who wins on that timeline? To me, in this situation and with those assumptions, if they are correct, this could be a very realistic scenario, maybe on a little bit of a longer timeline. 3 years seems very aggressive, but again, I have no idea. I'm not the one launching these rockets into space. In

### [22:58](https://www.youtube.com/watch?v=rzJC-ngZ6CY&t=1378s) Universal High Income

other news, if you've missed this, Karpathy dropped the Karpathy jobs project, where he looked at all of the jobs in the US economy, scored each and every one of them on how exposed to AI it is, and visualized it as a tree map. As you can imagine, people lost their freaking minds. Lots of news outlets started reporting on it, and Karpathy quickly took it down. The average score across all jobs is 5.3 out of 10. Software devs are 8 to 9. Roofers are 0 to 1. Medical transcriptionists: cooked.

Elon replied to this saying all jobs will be optional. There will be universal high income. A lot of people push back on this idea of a universal high income and how viable it is. Are we going to have it? It seems like a fantasy when you approach it from how the world is right now. Like, really, everybody's going to have just tons of money? Won't that mean that no one's rich, right? If everyone's rich, is no one rich?

The angle that I always try to get people to view this from is: don't think of it as getting more money and more resources. This will be largely handled through the fact that automation decreases the cost of everything. So it's not so much the idea that you need tons of money. It's the idea that everything just keeps dropping in price to the point where even a modest income can be 10x what you need each year for food, clothing, healthcare, etc. Or you can imagine it as being 100x what you need to survive every single year. At that point, it doesn't really matter whether you have 100x what you need versus 200x what you need. And I'm not saying that's guaranteed and that this is the bright future we're going to stumble into. I am worried about the short-to-medium term, that transition that we have to go through. There's a million things that can go wrong.
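The arithmetic behind that "10x what you need" framing can be sketched as follows. This is an illustrative toy model only, assuming a fixed income and a constant yearly percentage drop in the cost of basic needs. The function names and all numbers are hypothetical and are not forecasts from the video.

```python
def income_multiple(income: float, basics_cost: float, annual_decline: float, years: int) -> float:
    """How many times over `income` covers basic needs after `years` of falling costs."""
    cost = basics_cost * (1.0 - annual_decline) ** years
    return income / cost


def years_until_multiple(income: float, basics_cost: float,
                         annual_decline: float, target_multiple: float) -> int:
    """Whole years until a fixed income covers basic needs `target_multiple` times over."""
    if not 0.0 < annual_decline < 1.0:
        raise ValueError("annual_decline must be strictly between 0 and 1")
    years = 0
    cost = basics_cost
    while income / cost < target_multiple:
        cost *= 1.0 - annual_decline  # costs fall by the same fraction each year
        years += 1
    return years


if __name__ == "__main__":
    # Hypothetical scenario: income exactly covers basics today,
    # and the cost of basics falls 20% per year (an assumed rate).
    print(years_until_multiple(30_000, 30_000, 0.20, 10))
    print(years_until_multiple(30_000, 30_000, 0.20, 100))
```

The point of the sketch is that the outcome is driven entirely by the assumed rate of cost decline, not by income growth, which matches the framing that falling prices rather than bigger paychecks do the work.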
But I do see an outcome, potentially, if we get things right, where this happens due to large-scale automation, due to cheap intelligence, due to optimization of everything from the supply chain to how things get designed. You know, humans are the biggest cost in a lot of these goods and services. So, even when automation replaces jobs, as long as those drops in costs are passed down to the people, you could see a path to this being the case. I don't know if I would call it universal high income, because people are going to struggle with that concept. It's like, really, we just get free money? That might be difficult to imagine. But the flip side is that most of the basic living expenses get so inexpensive that you don't have to spend most of your life chasing them. And maybe then it is possible, with a small expenditure, to actually provide some sort of basic income for people that meets all their needs. And if they want to keep building businesses and accumulating more, they're welcome to.

Google recently announced that they're hiring a chief AGI economist. This is from Shane Legg, one of the co-founders of Google DeepMind. So there are people beginning to look at this and see what options we have available if and when we have to make that transition. So again, if this is us now, I do envision a day in the distant future where everybody's happy and it's a post-scarcity society, meaning that basically everyone can have at least, call it, a middle-class existence compared to today, and you don't have to work 40-plus hours a week to sustain that. I believe that is possible in the future. How we get there, you know, that's a concern. I don't know if we have all of the answers quite yet. Again, I am assuming that the need for human labor will go down. So the demand for human labor goes down, and the cost of goods and services goes down, as AI and automation go up.
Assuming these things are true, I think we can get there. Let me know if I'm missing anything or if you disagree. A lot of the pushback that I get is people saying, well, the governments will never allow people to have that much power. Maybe, sure, it's possible. Or the business owners, the billionaires, will not allow that. Sure, possible. I'm just approaching it from the perspective that there seems to be some proposal, some theory. There's a way to make these numbers work. And I really hope we at least have some model, some theory, about what this post-scarcity, post-jobs world looks like, so that when and if people start losing jobs, we can at least show them the plan and say, "But wait, don't panic. We have an idea." Because if we don't have a plan, well, that's when things can go south. If you made it this far, thank you so much for watching. My name is Wes Roth.

---
*Source: https://ekstraktznaniy.ru/video/20609*