# Gemini 3.1 Pro SPOTTED! GLM-5 Open Source Model DROPPED

## Metadata

- **Channel:** Universe of AI
- **YouTube:** https://www.youtube.com/watch?v=FIkoDJLzP4c
- **Date:** 12.02.2026
- **Duration:** 8:57
- **Views:** 10,325
- **Source:** https://ekstraktznaniy.ru/video/9494

## Description

Google is testing Gemini 3.1 Pro (spotted on Artificial Analysis Arena), while Z.ai just dropped GLM-5—a massive 744B parameter open-source coding model trained entirely on Huawei chips. We break down the benchmarks, the geopolitics, and what this release sprint means for the AI landscape.

For hands-on demos, tools, workflows, and dev-focused content, check out World of AI, our channel dedicated to building with these models: @intheworldofai

🔗 My Links:
📩 Sponsor a Video or Feature Your Product: intheuniverseofaiz@gmail.com
🔥 Become a Patron (Private Discord): /worldofai
🧠 Follow me on Twitter: https://x.com/UniverseofAIz
🌐 Website: https://www.worldzofai.com
🚨 Subscribe To The FREE AI Newsletter For Regular AI Updates: https://intheworldofai.com/

Gemini 3.1 Pro, GLM-5, Google AI, Z.ai, AI Models, Open Source AI, Machine Learning, AI News, Claude, GPT, LLM, Coding AI, AI Benchmarks, Huawei, AI Development, Tech News, OpenAI, Anthropic, DeepSeek, SWE-bench, AI Agents, La

## Transcript

### GLM-5 Release [0:00]

Z.ai released GLM-5 today, a 744-billion-parameter model that's open-source, MIT-licensed, and trained entirely on Huawei chips. It's closing the gap with Claude Opus 4.5 on coding benchmarks, and it's available for download right now. Here's what's in the release.

GLM-5 is the fifth-generation model from Z.ai, a Tsinghua University spin-off that went public in Hong Kong last month. The specs are pretty incredible: 744 billion total parameters with 40 billion active per token, double the size of GLM-4.7, their older model, and it was trained on 28.5 trillion tokens. It also has a 200,000-token context window, and it's MIT-licensed and available on Hugging Face. Anyone can download it, modify it, and use it commercially. Z.ai is positioning this as a shift from vibe coding to agentic engineering. Instead of just generating code snippets on request, GLM-5 is built for autonomous development work: navigating repositories, debugging builds, and refactoring systems with minimal human oversight.

Whether that's marketing or a real capability shift, the benchmarks show improvement. On SWE-bench Verified, GLM-5 scores 77.8%, Claude Opus 4.5 sits at 80.9%, and GLM-4.7 at 73.8%, so the new model is a four-point improvement over the previous generation. On Terminal-Bench 2.0, GLM-5 scores 56.2, which currently leads all open-source models. On Vending-Bench 2, a benchmark that measures long-term operational capability, GLM-5 actually scores number one among open-source models and is ahead of ChatGPT. On Humanity's Last Exam, GLM-5 scores 50.4, four points ahead of Claude Opus 4.5, Gemini 3 Pro, and GPT-5.2. The model is available now on Z.ai's platform, OpenRouter, and Hugging Face. For developers, this is a solid alternative for coding and agentic workflows. For the broader industry, it's another indicator that the gap between open-source and closed models is narrowing over time.
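Since the model is listed on OpenRouter, a quick way to try it is OpenRouter's OpenAI-compatible chat-completions endpoint. Here's a minimal sketch of building such a request; the model slug `z-ai/glm-5` is an assumption (check OpenRouter's model list for the exact ID), and the actual network call is left as a comment since it needs an API key:

```python
import json
import os

# OpenRouter exposes an OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_glm5_request(prompt: str, model: str = "z-ai/glm-5"):
    """Build the URL, headers, and JSON payload for a chat-completions call.

    NOTE: the default model slug is a guess based on OpenRouter's usual
    vendor/model naming; verify it against the live model list.
    """
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return OPENROUTER_URL, headers, payload

url, headers, payload = build_glm5_request("Refactor this function to be iterative.")
print(json.dumps(payload, indent=2))

# To actually send it (requires an API key and the `requests` package):
#   import requests
#   resp = requests.post(url, headers=headers, json=payload, timeout=120)
#   print(resp.json()["choices"][0]["message"]["content"])
```

The same payload shape works against any OpenAI-compatible server, so you could point it at a local GLM-5 deployment instead by swapping the URL.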

### Voxel City [2:10]

What you're looking at right now is an actual voxel city that GLM-5 coded for us. And we can see this is actually a pretty solid generation. The buildings look pretty accurate: they look like buildings, and they vary in size. What's really cool is how the city looks symmetrical and kind of planned out. We can also see that it generated cars, like this car over here at the bottom. I don't know where this car is going, it's driving out of the city, but it's kind of cool. And we can see more cars here. Let's see what happens with these two cars: do they intersect? No, they don't. Okay, so it looks like these cars can drive out of the city, which is kind of cool. But then we also have the day-night cycle. We can toggle auto-rotate; I guess this one doesn't work. We can drag and zoom and everything like that. So this is not bad at all.

### GLM-5 vs Claude Opus 4.5 [2:55]

Now, what you're looking at are two versions of a mini-golf game. This one is from Claude Opus 4.5; over here I'm using the arena to do a side-by-side test. And this one is from GLM-5. The Claude Opus 4.5 version looks aesthetically pretty nice, and if we compare it against GLM's, they look similar: they both start off the same, then we have a scorecard, a legend, a power bar, and a map, and I think each created nine holes. Even looking at this power bar, I would say Claude Opus 4.5 is a little bit better aesthetically. But let's try out the game a little.

If we look at this one, for example, well, the physics are there. That was kind of sensitive: I just touched it and was able to launch the ball. I can increase the speed. We can go to the next hole. This is pretty good; it's kind of fun, I'm not going to lie. There we go. So we got that. Now let's check out the GLM one. The first hole looks pretty similar. Let's see if I can get a hole in one. Let's go. Now, on to the next hole. Okay, this one looks tough. Looks like if I go out of here, it releases, so I've got to stay in here. Okay, the physics on this one are kind of messed up, because why did that happen? Unless those are the bumpers. Yeah, the bumpers are supposed to bounce the ball, I think. No, actually, these might not be the bumpers. Let me try one more time. The physics of the walls don't really work, I guess. Triple bogey. Okay, this is kind of crazy. We have bumpers, and water, I think. Let me try bumping into the water. What happens? One-stroke penalty. Okay, so now you can see I'm at -2. Let me try using the bumpers. Okay, the bumpers work, but the physics are a little bit iffy. So it's not bad at all. Let me compare this against Claude's. The physics in this one are good. What does it do? Oh, yeah, it sets me up again. Yeah, the walls actually work on here.
To be honest, I'm still going to cut GLM some slack, because it didn't claim to beat Opus 4.5. If you're using an open-source model and it's supposed to create something similar to what Opus 4.5 makes, this one kind of does the job. It's not bad at all. Yes, there's some mess-up in the game physics, but overall it's a pretty good generation.

### Gemini 3.1 Leaked [5:27]

A reference to a Gemini 3.1 Pro preview appeared on Artificial Analysis Arena earlier today. This is a third-party benchmarking site that tracks AI model performance. The model showed up in their internal database alongside dozens of other models, but Google hasn't announced anything officially.

Google released Gemini 3 Pro in November 2025. It's currently their flagship reasoning model, available in preview. Earlier leaks suggested that this model might become GA, meaning general availability. But right now we're seeing leaks of 3.1 Pro, which means they might be jumping the gun a little. So what would a 3.1 update mean? Historically, point releases in the Gemini line haven't been major architecture changes. They're typically performance tuning, bug fixes, tool-support updates, possibly extended context or output limits, or fine-tuning for specific use cases. For reference, Google shipped multiple minor updates to Gemini 2.5 Flash throughout late 2025, often without a lot of noise; some were just new model IDs with incremental improvements. The fact that this showed up on Artificial Analysis suggests Google is testing 3.1 internally and may have submitted it for benchmarking: you run a model through third-party evals to verify your performance claims.

A few things to monitor:

- **Google's API changelog.** If this is real, they'll update the developer docs with a new model ID like `gemini-3.1-pro-preview`, and that's the signal it's actually shipping.
- **Benchmark scores.** If 3.1 is a real update, we should see improved numbers on SWE-bench, GPQA Diamond, or other reasoning benchmarks.
- **Timing.** Google has been in an intense release cycle. They dropped Gemini 3 in November, upgraded Veo to 3.1 in January, and have been iterating on Flash variants monthly. A 3.1 Pro release in February would fit the pattern, given the new models from OpenAI and Anthropic.

But here's the thing: this could also just be noise. Third-party leaderboards sometimes show internal test builds that never ship publicly. Model names get reused, API endpoints get renamed for testing, and Artificial Analysis pulls data from various sources. It's possible they're seeing a staging environment or a renamed variant of the existing 3 Pro. Google's official docs still only reference Gemini 3 Pro preview and Gemini 3 Flash preview; there's no mention of 3.1 anywhere. And given how vocal Google usually is about a model release, it would be odd for them to soft-launch a point update without any announcement.

The leak also suggests the pressure might be getting to Google. OpenAI is supposed to drop GPT-5.3 any time now, and Anthropic has been iterating Claude Opus and Sonnet aggressively. DeepSeek is also supposed to enter the picture near the Chinese New Year, and Z.ai dropped GLM-5 today with competitive coding performance. Point releases like these are how Google stays competitive without retraining from scratch. Treat this as a credible hint, not a confirmation yet. If you're building on Gemini today, stick with the documented preview models and watch the changelog. When 3.1 becomes official, Google will definitely make it clear. But if the pattern holds, we'll know within this week.
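"Watch the changelog" can be partially automated: the Gemini API exposes a model-listing endpoint, so you can periodically check whether a 3.1 ID has appeared. A minimal sketch, assuming the live list comes from the ListModels endpoint (the filtering logic below runs on sample data standing in for that response; the `gemini-3.1` needle is the hypothetical version string being watched for):

```python
def find_new_model_ids(model_names, needle="gemini-3.1"):
    """Return the model IDs containing the version string we're watching for."""
    return [name for name in model_names if needle in name]

# The live list would come from the Gemini API's ListModels endpoint:
#   GET https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_KEY
# which returns entries with names like "models/gemini-3-pro-preview".
# Sample data standing in for that response:
sample = [
    "models/gemini-3-pro-preview",
    "models/gemini-3-flash-preview",
]

print(find_new_model_ids(sample))  # empty until a 3.1 ID actually ships
```

Run this on a schedule against the real endpoint and a non-empty result is exactly the "new model ID in the docs" signal described above.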

### Outro [8:39]

Make sure to subscribe to our channel; we do real tests, not just headlines. Make sure you're also subscribed to World of AI. And don't forget to check out our newsletter for deeper breakdowns you won't see on YouTube. I'm growing my Twitter following, so make sure you follow me on Twitter as well. Hope you guys enjoyed today's video, and I'll see you in the next one.
