That's the whole workflow now. This update connects three different worlds. Design lives in Stitch. AI agents live in tools like Cursor, Antigravity, Claude, and Gemini CLI. Your actual code lives in React, HTML, Tailwind, whatever you use. And now all three of those worlds talk to each other automatically through MCP. This is not just
here's where it gets really interesting. This isn't just about cost. In programming challenges, it dominated the Aider Polyglot benchmark with 71.6%, beating Claude 4 Opus in multi-language coding tasks. And it processes information three times faster than its previous version. Now, let me be clear. DeepSeek 3.1 isn't the best at everything. Different
comparing GPT-5 against Anthropic's Claude models when it comes to coding in real-world production codebases. I find that whilst many models do well on these one-off applications, like this French Adventure game that they made in the livestream, that performance doesn't necessarily carry over to coding in a real production codebase. The codebase that
this quick video, I'm going to cover what makes Opus 4.6 different, and I'm going to test three use cases live: using regular Claude for podcast post-production, using Claude Code to generate a game, and using Cowork to generate a presentation. So, let's dive in. Okay, so what's new with Opus 4.6? There...with Claude, it can get lost and ignore your instructions, but with Opus 4.6, it remembers your initial prompt, even if you're deep into the thread. I'll demo this in the Claude use case next. Number two, it gathers context before acting much more. So, one of the criticisms of Anthropic's models, especially for coding, is that
"Work with Apps" button gives you control of the app you're working with. Is this better than what was already out there? Honestly, I think Claude 3.7 is still better for coding, but if you're on a budget, ChatGPT might be better because you're using the app version, which doesn't cost API credits
watch and engage with. But it was still like, I remember someone asked me back then, like, "How did like random button presses do in comparison?" And Claude was like this much better than pressing random buttons, you know. Like it was better but like, not a lot better - Yeah. - than pressing random buttons. And so again, I like tucked...realized 3.7 on it was way better, I was going through watching it play and I realized there was this like terrible bug in my code where I wasn't showing Claude all of the information it needed to play the game. - Oh, wow. - I had this like thing at the time that was helping like show
really interesting. OpenAI is positioning this as their answer to the competition. Google and Anthropic have been eating their lunch lately. Gemini 3 dominates multimodal tasks. Claude is trusted by enterprises for code and reasoning. OpenAI needed something to reclaim leadership, and Garlic might be it. This isn't just about being the best chatbot. This is about
right now and run it on your own computer for free. No API costs, no subscriptions, nothing. But here's the crazy part. This thing beats Claude Sonnet 4.5 on actual coding benchmarks. It beats Google Gemini 3 Pro. It is literally outperforming the paid models that everyone's using right now. And I'm going to show you exactly
using an AI to create a game. There is something that you can do right now. You can ask a chatbot like the new Claude 3.7 to write the code for you for a snake game that is self-aware and even does unexpected things, like escaping the matrix, and more. And it gets better, you know
early field of AI was poker bots. Are you seeing any good AI market makers? When you say no one's read all these tax codes, I mean, no one except Claude. That's fair. We should ask. We are seeing more, increasingly more people using agents to trade. So that's definitely— Especially on the API side
that link in the comments and description. You now know what are the best free AI alternatives to Manus: Convergence AI coming in first, Claude coming in joint first for coding, and then OpenManus, and then after that out. So thanks so much for watching, and if you want to get a free one-to-one SEO strategy session, feel
Gemini 3 outperforming Gemini 2.5, which outperformed Gemini 2, etc., etc. If you are being extra cynical, you may wonder about benchmark maxing, where the performance in coding and mathematics and other benchmarks that are known to be highly publicized might be maximized to the detriment of the core parameter count and general knowledge. You could say general intelligence...much easier and cheaper to serve to hundreds of millions of people. Just purely my personal opinion, I will say that despite this SimpleBench result, Claude Opus 4.5 is my coding go-to model at the moment. Now, you guys may wisely conclude, well, the best model is just the one that's best for my use case
what can you actually use in 2026? There are five categories of AI tools. Language models: ChatGPT 5.2, Gemini 3, DeepSeek 3.2, Claude, Grok. Everything text: writing, analysis, coding, research. Image generators: Nano Banana Pro is the 2026 standout. Consistent characters, 4K output, serious precision. Video generators: Sora 2, Veo 3.1, Kling 3.0, cinematic clips
this option and this option are the ones with potential, but it's very hard to do this kind of thing in tools like Lovable or claude.ai or even Claude Code, right? It doesn't really design that way. How I solve this is that I like to use a tool called Magic Patterns. There is a feature
Manus using? It's kind of like a wrapper, right? So, it can use different tools such as, like, Browser Use, etc. It tends to use Claude, I think, as its main coding API. I'm not 100% sure what API it's using for video generation, but I think it'll be using Kling. And then here's another example
data from multiple sources, processes it, and sends personalized emails to members. With Claude Opus 4.6, you could drop in all your existing automation code, your email templates, your database schemas, everything. Claude would understand how it all connects. Then you can say, "Build me a new workflow that does X, Y, and Zed," and it would create something...sequences. The script had a few bugs that were causing emails to send at the wrong times. Claude Opus 4.6 found all the bugs in one go. It read through the entire codebase, spotted the issues, and gave me fixed code. It even explained why each bug was happening. GPT 5.3 Codex also found the bugs
tech audience that's watching this. So what we did instead was we said we'll put in a node, and the node is called "Code in JavaScript," but we'll go to Claude. We'll ask Claude, or any model, you can use any model to do this. We'll dump in all the results we got from this
OpenAI just dropped GPT 5.2. The model has three variations: Instant for quick everyday stuff like writing and translation, Thinking for the heavy lifting (coding, spreadsheets, presentations, long documents), and Pro for the hardest problems where you absolutely need the right answer. The good news is OpenAI is giving you automatic model choice. You don't have to pick...actually help with strategic planning. OpenAI is celebrating their 10-year anniversary this week. After Google's Gemini 3 was crushing benchmarks and Anthropic's Claude Opus 4.5 was beating them on coding, OpenAI went into code red mode and they came out with this new model. We've made a dedicated video testing if it's actually better than
right to build the design for the game, cuz I think Grok's DeepSearch is the best design tool on the planet. Then we used Claude 3.7 Sonnet to write the code. We put it into Cursor, we perfected the code in Cursor, we put it on GitHub, and now it's live on the internet. And this...took all, what, 10, 15 minutes. Didn't write a single line of code. The AI did it all for us. This is amazing. If you haven't played with Grok 3 yet, Claude Sonnet yet, you're missing out. When new technology drops, you need to be using it. This is how you get ahead, right? This
your website. So, for example, for me, I could set up a website right here. And we could put this in the code section, for example, and just add a new folder right there that Claude Cowork can run from, right? So, let me do that. So, we're going to go to code. Then, we're going...game is fun, dopamine-inducing, crazy colors, etc. Then we're going to hit "Let's go." And what you can see here is, like, Claude Cowork is going to begin to code that out and build it out. Right. Whilst we're waiting for that to load, someone asks, is there an offer on right