Let's go straight into it. I've been using this non-stop for the last week; I was lucky enough to have early access. Here are the big strengths and weaknesses, and I'm going to demonstrate each one of them to show you what I mean.

So number one, it is the best problem-solving AI model ever. It is unbelievably intelligent. This is the best AI research tool I've ever used in my life. I set it off to do some research on LLMs for beginners. I said, "Hey, can you start researching machine learning concepts for beginners and give me a detailed description for stone-cold beginners?" It is going now, it has been working for I think about 45 seconds, and it is scouring the internet, dozens of websites all at the same time. It's doing it step by step, coming up with new concepts it needs to research and then researching like 20 websites for every single new concept.

There has never been a smarter model released, and it showed in the benchmarks I just showed you. But here's the thing: when I get over to weaknesses in a second, I'll show you why AI is not all about raw intelligence. We're sticking to the strengths here, though, and when it comes to research intelligence, it's incredible.

So, here is the final research report it built for me. This took, I think, at most 3 minutes. This is unbelievable. It's basically a full in-depth research report built on top of hundreds of sources, and it took just a couple of minutes. On top of that, you can build entire websites based on it. If I want a web page based on this research report, it will quickly build a website and launch it for me with all the information I need. Boom. Just like that, in under 30 seconds, it built an entire website with images that it generated for each section. Wow, look at these charts. That is unbelievable. And I can share this out.
So if I wanted to share and export this website, put it in a Google document, and share it out, I can do that. If I wanted to build a quiz, flashcards, or an audio podcast based on this, I can get that in one click from the research report. This is the most powerful research tool I've ever used.

It is also incredible at building prototypes fast. I have a basic test I do with every new AI model that comes out, and that is a 3D first-person shooter test. I basically give it creative freedom. I say, "Make this the most stylistic, fun, and visually appealing game possible. Have at it." And this was the best result I've ever gotten from an AI model on this test. It produced the best code I've ever seen. From just that one prompt, it generated the code for this 3D first-person shooter in about a minute. I don't think you can hear it, but it has sounds for when the bullets fire and for when they hit enemies. That's a power-up. It has power-ups. It has these great graphics. And most impressive to me, it actually puts the gun on the screen. This is the first time in any of these tests I've done with models where it actually put a gun on the screen and bullets weren't flying out from random places in the middle of the screen. This is really, really impressive. So for generating prototypes (and notice how I say prototypes; I'll get more into that in a second), it is the best ever.

And then media generation: bar none, the best. I made the thumbnail for my last YouTube video with it. Basically, I took one of my older thumbnails and said, "Change the text from 'it changes everything' to 'I was not expecting this.'" Perfect. Flawless. You would never have guessed an AI changed the text there. Then I said, "Can you make the Grok 4.1 logo bigger inside the owl image?" And boom, perfect. It made it bigger. It didn't ruin the image in the background. It didn't muddle anything. It didn't change the way my face looked.
With a lot of these other AI image generators, when you change little details like that in the image, it will make all the people look different. Here it was completely perfect. Nothing changed in any other part of the image. And then I said, "Can you change the background color from orange to any other color?" And it did it perfectly. Nothing ruined in the image. Text looks the same. Image looks the same. My face looks the same. It looks perfect. This is the best AI image generator of all time. It is incredible. There is literally no competition. And I guess that's what you would expect when you have Google, which has Google image search, the biggest image database in the world, as well as YouTube, the biggest video database in the world. They nailed it. This is incredible.

So, we went over the
strengths. We covered how, if you're just solving black-and-white challenges, you need research, you need information, or you're dealing with media, this is the best ever. But this isn't going to be my daily driver just yet, and here is why.

Let's get into the weaknesses. Not everything when it comes to AI is black and white. Not everything is an equation to be solved. For me, my biggest use case for AI is actually creative writing and business planning. I use AI as my partner for everything I do at my business and for all the creative writing I do. And this was actually not the best model I've used when it comes to that.

Let me give you an example. For business planning, every idea I come up with, I run through AI and I say, "Hey, build me a roadmap, come up with interesting features, and show me how I can increase engagement." I expect an AI to be a helpful partner I feel I can trust when it comes to building that business. But there are a couple of red flags for me here. One is that the vibes are just off. The way it talks to you is very AI. I'm going to show you an example compared to ChatGPT 5.1 in a second, but the way it talks to you feels very AI researcher. And the ideas it came up with for the app I'm working on (an app store for solo-built, vibe-coded apps) were, again, very AI-like. It recommended implementing streaming in the app, and streaming on a website is just something not a lot of people are going to do. We don't need another streaming website. For gamification and community governance, it has weekly bounties for suggesting features. It has vibe battles, so Tinder for apps. While these are interesting ideas, people wouldn't actually use these features. These aren't realistic features people would want. They're just kind of AI ideas.

But if I compare this to 5.1 Thinking, which I think is the best creative thinker and business planner model out there right now, right off the rip it says: "Right now you've basically spec... That's fine, but it's not defensible and won't drive retention by itself." It feels humanlike. It's pushing back on me a little bit. "You need reasons to come back every day." And then the ideas it gave me were very strong. These are realistic ideas people would want: a build log feed so people can see how apps update, structured feedback requests, performance-based leaderboards. These were realistic ideas that people would use, and I actually implemented them. It doesn't feel like an AI gave me these ideas. It feels like a human came up with them.

So when it comes to AI, you have these benchmarks that measure raw problem solving and raw code generation, and yes, this is the best. If you have measurable problems that are black and white, this is the best model yet. But there are a lot of gray-area things, like creative writing and feeling like a human being, and it doesn't quite nail those for me. I think vibes are huge with an AI. If you're going to be talking to an AI for hours a day, it needs to feel warm and friendly. It needs to feel human. It doesn't quite pass that vibe test for me.

For instance, I'm launching a vibe coding academy next week, which is just a community for vibe coders, and I asked for ideas. It gave me a few ideas on how to improve the community and course, which is good. That's what I asked for. It gave me exactly that. But then we look at what GPT 5.1 gave me, and it just knocked it out of the park. It gave me a bunch of recommendations. It quelled my fears. It saw in my prompt that I had fears about certain angling and positioning of the course and community. It went over pricing, which is a very important part of this.
It went over anxieties the customer might have, and then it gave me an entire structure to ship with: what I should have for day one and what I should position later on. It went above and beyond and thought like a human being. "Okay, where might he have fears? Where might the customers have fears? I'm not just going to give him a few business ideas. I'm going to make sure he feels totally comfortable with this launch." Those are things I look for in an AI: that extra-mile vibe test where you just feel good and warm using it. I know that's not measurable. I know you can't really benchmark that. It's kind of ooey-gooey, but when I'm talking to an AI for hours and hours a day, I want the vibes to be immaculate. I want it to feel like I'm working with a human being. And for me, Gemini 3 falls a little bit short on that scale. So for me, this is the greatest research tool and problem-solving tool ever. If I have a problem, if I need something researched, if I need an answer to a very black-and-white question, Gemini 3 is it.

A couple of other weaknesses here. It is pretty expensive: $2 per million input tokens and $12 per million output tokens. That's if you're under 200,000 tokens, and it actually goes up if you go over 200,000 tokens. That is a little bit more expensive than GPT 5.1, which is $1.25 in, $10 out. So you are paying a considerable amount more if you're using this through the API. It's not the most cost-effective model out there, but I'm very confident they'll come out with Gemini 3 Flash very soon. Gemini 2.5 Flash was the best lightweight, cheap model out there, so I'm sure Gemini 3 Flash is going to be incredible and probably the go-to for cheaper models.

And here's the last part: no great coding harness. In terms of raw coding ability, Gemini 3 might be the best, but I'm actually going to stick with Claude Code for my longer, bigger projects. Claude Code is the best coding harness out there.
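Going back to the API prices quoted above, here is a rough sketch of what that difference looks like per request. The per-million-token prices are the ones mentioned in the video; the request size (50,000 input tokens, 5,000 output tokens) is a made-up example, and the higher Gemini 3 rate above 200,000 tokens is not modelled because the exact figure isn't given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as quoted in the video. The Gemini 3 figures apply under 200,000 tokens.
GEMINI_3 = Pricing(input_per_million=2.00, output_per_million=12.00)
GPT_5_1 = Pricing(input_per_million=1.25, output_per_million=10.00)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given per-million-token prices."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Hypothetical request size, chosen only for illustration.
input_tokens = 50_000
output_tokens = 5_000

gemini_cost = request_cost(GEMINI_3, input_tokens, output_tokens)
gpt_cost = request_cost(GPT_5_1, input_tokens, output_tokens)

print(f"Gemini 3: ${gemini_cost:.4f} per request")
print(f"GPT 5.1:  ${gpt_cost:.4f} per request")
print(f"Gemini 3 costs {gemini_cost / gpt_cost:.2f}x as much for this request")
```

The ratio depends on the mix of input and output tokens, so it is worth plugging in your own typical request sizes before drawing conclusions.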
Claude Code takes a very strong coding model in Sonnet 4.5 and makes it even better, because it gives it really good instructions. Gemini 3 doesn't really have that kind of great coding harness. The closest you're going to get is AI Studio. If you go to aistudio.google.com, you can start building out prototypes very easily. You give it an idea for an app, and it builds the prototype out. It builds the V1. But this isn't great for long coding sessions, like when you're building a really complex app. I built out a really complex end-to-end app called Creator Buddy in 4 months with Claude Code, and it was excellent. You're not going to get that here. I'm still waiting on Gemini CLI to improve, or on some sort of Gemini coding solution. It isn't there yet. But if you're looking to build out the best prototypes you can, Gemini 3 is the winner. For longer coding sessions, go Claude Code. For shorter coding sessions, I'm going with Gemini 3 in Google AI Studio.

So here is my updated list of