Claude Sonnet 4.5 vs GLM 4.6: Who Wins? 🔥

Julian Goldie SEO · 02.10.2025 · 8,024 views · 109 likes · updated 18.02.2026
Video description
Want to get more customers, make more profit & save 100s of hours with AI? https://go.juliangoldie.com/ai-profit-boardroom
Get a FREE AI Course + Community + 1,000 AI Agents + video notes + links to the tools 👉 https://www.skool.com/ai-seo-with-julian-goldie-1553/about
🤖 Need AI Automation Services? Book a FREE AI Discovery Session Here: https://juliangoldieaiautomation.com/
🚀 Get a FREE SEO strategy Session + Discount Now: https://go.juliangoldie.com/strategy-session
🤯 Want more money, traffic and sales from SEO? Join the SEO Elite Circle 👇 https://go.juliangoldie.com/register
Click below for FREE access to:
✅ 50 FREE AI SEO TOOLS
🔥 200+ AI SEO Prompts!
📈 FREE AI SEO COMMUNITY with 2,000 SEOs!
🚀 Free AI SEO Course
🏆 Plus TODAY's Video NOTES...
https://go.juliangoldie.com/chat-gpt-prompts
Join our FREE AI SEO Accelerator here: https://www.facebook.com/groups/aiseomastermind

Table of contents (3 segments)

  1. 0:00 Segment 1 (00:00 - 05:00) 927 words
  2. 5:00 Segment 2 (05:00 - 10:00) 930 words
  3. 10:00 Segment 3 (10:00 - 10:00) 148 words

Segment 1 (00:00 - 05:00)

Today I'm going to show you the craziest AI coding battle of 2025. Claude Sonnet 4.5 just dropped and it can run for 30 hours straight. GLM 4.6 came out swinging with 200,000 tokens of context. I tested both of them head-to-head on real coding tasks. One of them crushed it, the other one surprised me. I'm going to show you which one to use for your business. Plus, I'll give you the exact prompts I used so you can test them yourself. This is going to save you hundreds of hours and thousands of dollars. Let's go.

So, here's what happened. Anthropic just released Claude Sonnet 4.5 and it's being called the best coding model ever made. Then Zhipu AI dropped GLM 4.6 right after, and they're claiming it matches Sonnet on some tasks. But here's the thing: when I actually tested both of them, the results were wild. Some tasks Sonnet crushed, other tasks GLM surprised me. And I'm going to show you everything.

Hey, if we haven't met already, I'm the digital avatar of Julian Goldie, CEO of SEO agency Goldie Agency. Whilst he's helping clients get more leads and customers, I'm here to help you get the latest AI updates. But first, you need to understand what makes these models different, because if you pick the wrong one for your use case, you're going to waste time and money, and nobody wants that.

Let's start with Claude Sonnet 4.5. This thing is a beast. Anthropic says it's the best model for coding, building complex agents, and using computers, and they're not lying. I tested it. The big upgrade here is autonomy. The old version could run for about 7 hours on its own. This new version: 30 hours. That's over four times longer. Think about what that means. You can give it a massive project, go to sleep, wake up, and it's still working. That's insane. But that's not all. Claude Code got massive updates. It now has checkpoints, so if something breaks, it can go back. It has code execution built in so it can test its own code, and it can create spreadsheets, slides and docs natively.
This is huge for automation. You can literally tell it to build an app, test it, and create a presentation about it all in one go. The benchmarks look crazy, too. Sonnet 4.5 shows big gains in reasoning and math. It's crushing tasks that used to trip up AI models. And for long-context tasks, it's performing way better than before. We're talking about OSWorld benchmarks and AIME examples where it's just dominating.

Now, let's talk about GLM 4.6. This is Zhipu AI's latest model, and it's focusing on different things. The main upgrades are improved reasoning, tool use during inference, and stronger coding performance compared to GLM 4.5. But here's the interesting part: they're claiming it has a much larger context window. We're talking 128,000 tokens going up to 200,000 tokens in some versions. That's massive. And GLM 4.6 is already getting integrated into coding tools. You can use it through the Z.ai API and OpenRouter. Tools like Kilo Code are already adding support for it. So, the ecosystem is growing fast. Zhipu's own write-ups claim clear gains across multiple benchmarks, and they say they have parity with some Sonnet models on several tasks. But here's where it gets tricky: they're still trailing Sonnet 4.5 on coding in some tests. So, it's not a clear winner. It depends on what you're doing.

Before we go any further, if you want to scale your business with AI and get more customers while saving hundreds of hours with automation, you need to join my AI Profit Boardroom. This is the best place to learn how to use AI to actually make money. We've got case studies, workflows, and a community of people crushing it with AI. The link is in the description. Go check it out right now.

Now, let's get into the benchmarks, because this is where things get interesting. I made a simple comparison. Three categories: reasoning, coding, and long-context agent tasks. For reasoning, both models are strong, but GLM 4.6 shows really good performance on reasoning tasks. Sonnet 4.5 also improved a lot here compared to older versions. For coding, Sonnet 4.5 is the clear winner. It's built for this. The coding benchmarks show it's the best model right now for writing and debugging code. GLM 4.6 is good, but it's not quite there yet. For long-context agent tasks, this is where it gets nuanced. Sonnet 4.5 can run for 30 hours straight. That's the big selling point. GLM 4.6 has a bigger context window. So, if you need to process massive amounts of information at once, GLM might be better. But benchmarks don't tell the whole story. So, I tested both models myself.

First test: a coding challenge. I gave both models the exact same task. Build a Python CLI app that takes in a CSV file, does some calculations, and outputs a slide deck with the results, then run tests and fix any failures. I also told them to provide checkpoints after each major step. This is a realistic task. It's the kind of thing you'd actually use AI for in a real business. I started the timers. Sonnet 4.5 went first. It immediately broke the task into steps, created the Python file, set up the CSV reader, did the

Segment 2 (05:00 - 10:00)

calculations, then it built the slide deck generator. After that, it ran tests. One test failed, it caught the error, fixed it, ran the tests again, everything passed. Total time: about 12 minutes. The code was clean, the tests passed, and it gave me a working slide deck at the end. I was impressed.

Then I ran GLM 4.6. It also broke the task into steps and created the Python file. But here's where it differed. It spent more time explaining what it was doing. The code it wrote was solid, but when it got to testing, it had one issue. It didn't automatically fix the test failure on the first try. I had to prompt it again. After that, it fixed it and everything worked. Total time: about 15 minutes. The final result was good, but it needed a bit more handholding. Winner for this test: Sonnet 4.5. It was faster, it handled the test failures automatically, and it needed less intervention. That's what you want in a coding agent.

Second test: a long-context agent workflow. I gave both models a massive task. I uploaded 15 different documents: design specs, requirements, user feedback, everything. Then I told them to read all of it and create a five-step implementation plan. The plan needed to be prioritized by effort and impact. This is a real use case. If you're managing a project with tons of documents, you need AI to summarize and plan for you. Sonnet 4.5 processed all the documents. It took about 3 minutes. Then it gave me a clean five-step plan. The priorities made sense. It understood which tasks would have the biggest impact, and it explained the effort required for each one. Solid work. GLM 4.6 also processed the documents. But here's where its bigger context window helped. It seemed to retain more details from each document. The plan it gave me was also five steps, but it included more specific references to the documents. Like, it would say "based on the user feedback in document 7" or "the design spec in document 3 suggests this." That level of detail was impressive.
Total time: about four minutes. A bit slower, but more thorough. Winner for this test: it's a tie. If you want speed, go with Sonnet. If you want maximum detail and you have huge context needs, GLM 4.6 is really good.

So, here's my verdict. If you need the best coding model right now, use Claude Sonnet 4.5. It's faster, it handles errors better, and the 30-hour autonomy is a game-changer for long-running projects. Plus, the new features like checkpoints and code execution make it the most reliable option for building and testing code. If you're a developer or you're building agents that need to code, this is the one. But if you need a model that can handle massive amounts of context, or you're already using the Z.ai ecosystem, GLM 4.6 is worth testing. The reasoning is strong, the context window is huge, and for some tasks, it gives you more detailed outputs. Plus, it might be cheaper for high-volume inference. So, if you're running a lot of queries and you need cost efficiency, test GLM 4.6 on your specific workload.

Let me give you the exact prompts I used so you can test this yourself. For the coding task, here's the prompt: "You are an expert developer. Given this repository, implement feature X, add unit tests, run tests, and if any tests fail, fix them. Provide checkpoints after file creation, after tests, and after fixes. Summarize final test status and runtime." That's it. Simple, clear, and it works. For the long-context task, here's the prompt: "You are an agent. Ingest these 20 design docs and produce a five-step implementation plan prioritized by effort and impact." Again, simple and direct. That's the key to good prompting. Don't overcomplicate it.

All right, let's talk about what this means for your business. If you're running an agency, you need to be using these models. Period. Coding tasks that used to take your team hours can now be done in minutes. Document analysis that used to take days can be done in seconds. This is a massive productivity boost.
And if you're not using it, your competitors are. So, you're falling behind. Or let's say you're a developer. You're building a SaaS product. You need to add a new feature, write tests, fix bugs, document everything. That used to be a multi-day project. Now, you give the AI a detailed prompt. It writes the code, tests it, fixes the bugs, and even writes the documentation. You just review it and ship it. That's insane. This is why I'm so bullish on AI. It's not hype. It's real, and it's changing how we work. If you're not using these tools yet, you're missing out on massive opportunities.

Julian Goldie reads every comment, so make sure you comment below and let me know which model you want me to test next. Do you want to see more coding battles? Want to see these models tackle different tasks? Tell me what you want to see.

Now, before we end, I have two things for you. First, if you want to scale your business with AI, save hundreds of hours, and get more customers, join my AI Profit Boardroom. This is the best place to learn how to use AI to actually grow your business. We've got proven case studies, workflows, and a community of people crushing it with AI. The link is in the
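The video doesn't share the code the models actually produced, but the first test task (a Python CLI that reads a CSV, does some calculations, and outputs a slide deck) can be sketched roughly like this. This is a minimal illustration, not either model's output: the function names are made up, and plain Markdown sections stand in for a real slide-deck generator.

```python
import csv
import statistics
import sys
from pathlib import Path


def summarize_csv(csv_path):
    """Read a CSV and compute mean/min/max for each numeric column."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    stats = {}
    for col in rows[0]:
        try:
            values = [float(r[col]) for r in rows]
        except ValueError:
            continue  # skip non-numeric columns (e.g. names, labels)
        stats[col] = {
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return stats


def write_slides(stats, out_path):
    """Emit one 'slide' per column as a Markdown section
    (a placeholder for a real deck generator)."""
    lines = ["# Results Deck"]
    for col, s in stats.items():
        lines.append(f"\n## {col}")
        lines.append(f"- mean: {s['mean']:.2f}")
        lines.append(f"- range: {s['min']:.2f} to {s['max']:.2f}")
    Path(out_path).write_text("\n".join(lines))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python deck.py input.csv output.md
    column_stats = summarize_csv(sys.argv[1])
    write_slides(column_stats, sys.argv[2])
    print(f"Wrote {len(column_stats)} slide(s) to {sys.argv[2]}")
```

The test described in the video also required the model to run its own unit tests against code like this and fix any failures, which is where Sonnet 4.5's automatic error recovery made the difference.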

Segment 3 (10:00 - 10:00)

description at https://www.skool.com/aiprofitlab7462/about. Go join right now. Second, if you want to make more money with AI, you need to join the free AI Money Lab. Inside, you'll get 50-plus free AI tools and 200-plus ChatGPT SEO prompts. You'll learn how to make money with AI agents. You'll get 1,000-plus free n8n workflows. You'll get 200-plus ChatGPT prompts. And you'll get a full blueprint to generate thousands of leads free with AI. Plus, you get a free AI community, a free AI course, and proven AI case studies. This is everything you need to start making money with AI. The link is in the description. Go grab it now. All right, that's it for today. If you got value from this video, smash that like button, subscribe if you haven't already, and I'll see you in the next one.
