Gemini 3.5 LEAKS: Google Tests New Models on LM Arena!
10:16

Universe of AI 14.12.2025 7,387 views 153 likes updated 18.02.2026
Video description
Google is quietly testing new AI models on LM Arena, including two labeled Fierce Falcon and Ghost Falcon, which may be part of the upcoming Gemini 3.5 lineup. Watch me test them out! There has been no official announcement yet, but the signals suggest Google is actively validating new models before a broader release. For hands-on demos, tools, workflows, and dev-focused content, check out World of AI, our channel dedicated to building with these models: @intheworldofai

🔗 My Links:
📩 Sponsor a Video or Feature Your Product: intheuniverseofaiz@gmail.com
🔥 Become a Patron (Private Discord): /worldofai
🧠 Follow me on Twitter: /intheworldofai
🌐 Website: https://www.worldzofai.com
🚨 Subscribe To The FREE AI Newsletter For Regular AI Updates: https://intheworldofai.com/

#Gemini35 #GoogleAI #AILeaks #LMArena #GPT52

gemini 3.0,google ai,google gemini,gemini 3.0 flash,gemini 3.5,gemini 3.0 pro,gemini 3.0 flash,gemini ai,lm arena,ai leaks,ai news,llm,google deepmind,sundar pichai,claude ai,anthropic ai,gpt 5,future of ai,ai models,ai update,ai comparison,skyhawk model,seahawk model,ai benchmarks,gemini 3,gemini 3.0 coder,gemini 3.0 pro coder,gemini coder,build full stack apps,google ai,agi,gemi 3.0 pro,gemini 3.0 demo,ai web design

0:00 - Intro
0:54 - Subway Surfer Test!
2:13 - Making a Poker Game!
4:09 - Making a Chess Game!
6:32 - Web Based OS!
8:22 - SVG Code Test!
9:49 - Outro

Contents (7 segments)

  1. 0:00 Intro 174 words
  2. 0:54 Subway Surfer Test! 278 words
  3. 2:13 Making a Poker Game! 423 words
  4. 4:09 Making a Chess Game! 485 words
  5. 6:32 Web Based OS! 367 words
  6. 8:22 SVG Code Test! 296 words
  7. 9:49 Outro 92 words
0:00

Intro

Google is currently testing two new AI models on LM Arena called Fierce Falcon and Ghost Falcon. They haven't been officially announced, but Google has used this approach before, quietly testing early versions of models in public comparison tools like LM Arena before releasing a more polished version later. In past Gemini releases, this testing phase has helped Google find issues early and then ship a stronger model to everyone afterward. The timing also matters. OpenAI has just released GPT-5.2, which raises the bar again on speed and capability. At the same time, Google has also been testing upgrades to its image generation models on LM Arena. Taken together, this suggests that Google is under real pressure to move quickly, and these Falcon models could be part of the next stage of the Gemini 3 lineup. So, today I'm going to show you these stealth models, as well as some examples of how powerful they are. So, let's get into it. So on LM Arena, when you put two
0:54

Subway Surfer Test!

anonymous models against each other, you don't really know what model you're going to get. So I've had to take some time and, you know, give it multiple prompts to finally get the stealth models. So what you're seeing right now is that I asked the model to create Subway Surfer, and I have two models that created different outputs. So the first output we're going to see is actually from Ghost Falcon. And if we look at it, the character doesn't really look like a human, but the game function works. Like, I can collect all these coins. I can move. It's pretty clear. So it's not bad. This is done by the new Ghost Falcon model. So let's die here. And it keeps our score and everything like that, which is good. But if we look at the other version, this one's actually made by Claude Opus 4.5. So from a UI perspective, the Claude one is obviously better, but when you actually run the game, you have a human, but it's kind of a little bit buggy. It's, you know, obviously not the best compared to that. Like, I don't know what's happening with the buildings on the side. So I still think that Claude Opus 4.5 obviously has a better release, because the Google model is not bad, but obviously this one looks way better, and then there's the score system, coins, and everything like that. It keeps a high score, compared to this one, which I don't think really has a high score; it just shows whatever you scored. Then I actually
2:13

Making a Poker Game!

asked the models to create a poker game. So we have two generations here: one from Ghost Falcon and the other from DeepSeek version 3.2. But for some reason the DeepSeek version 3.2 kind of broke, and I guess it didn't produce any output. So that's kind of weird, even though this model is supposed to be pretty good. But if we look at the Ghost Falcon one, I've tested it out once, but I will show you guys once again what it is. So you can deal your cards. The UI looks pretty good. The only thing that's buggy is that it keeps on dealing you the same cards every single round. So, we got a four and a six, and we have an ace, a 10, and a king on the board. I usually would fold here, but just for the sake of it, I'll raise 50. So, I've raised it to 950. As you can see, every time it goes through somebody else's turn, it kind of glitches out. Not glitches out, but it looks like it's generating for the first time. I'll just check. So, then it's this guy's turn. He checks. He folded. So, it's me and this guy. We got this. I'll just check again. Okay. So this guy is the winner. He has queens and sixes. So we got a four and a six this time. Let's see, if I press deal cards, do I get different cards? Okay. Yes, I do. So at least that works. But the pot level is 50. So it's not bad. The UI is good. The game is good. I can raise. The only thing is, I can only raise by 50. I can't really raise by anything else. So that is weird. Unless there's... Yeah, that's the only thing: I can only raise by 50. So once again, I'll just see if it ever lets me win. I'll just check. So this game I had $850. So it's keeping track of how much money I have, which is a good sign. So this guy got $450. So next round, yep, it looks like he got more money over there. And I have my $850. So it looks like the game is good, and this is working pretty well.
And this is not bad for what I was able to create, compared to this DeepSeek version where there is no preview.
4:09

Making a Chess Game!

And I think this example of me asking the models to create a chess game really shows how good this new model, the Ghost Falcon one, could be. So right now, what we're looking at is the Kimi K2 Thinking Turbo model, which is once again an open-source model that's supposed to be really good, but it looks like it wasn't even able to generate the chessboard. So I can't even click any of these buttons. I can click on new game, all these buttons. It looks like it, but, you know, nothing is really happening. I don't know why it's like that. But then we have the Ghost Falcon one, which kind of looks like the Apple version of the game. But the funny thing is, why is it missing pieces here? Which is interesting. So if I flip the board... Yeah, I don't have a thing. I don't have my... I forgot what this piece is called, to be honest. But I don't have those. So I guess I'm already at a disadvantage. But if I click on this, can I move it? No, because it's white's turn. So I guess I'll make a move. Now it's black's turn. Yeah. Okay, I guess I have an imaginary piece. No, I don't. I don't have anything there. Interesting move. I guess the AI is giving me a disadvantage, but it's not bad. Let me see if I can... Oh, there's an undo. "Undo not implemented in this version." Interesting. Can I do a new game somehow? I guess I have to play this. Let's see if killing works. Okay, it looks like it got that part of the game right. We can move the queen here. Let's give ourselves a checkmate. Wait, I can't do that for some reason. I'll just kill the queen. Kill, blah blah. Let's move here. Okay, so it looks like some basic things are working. It's weird how there's no piece here. Can anybody put in the comments what this piece is called? Because I'm forgetting right now. But this is really good because the UI is good. You guys think I'm glazing, but I'm not trying to glaze. I'm just being honest. Making a chess game is hard.
Compared to this, where there's no generation, versus this, where the generation is actually pretty good. It's not bad, you know. Like one, two, three... it's eight. It's eight by eight. One, two, three, four, five, six, seven, eight. Yeah, 8 by 8. So it got that, and it got the movement of all these pieces right. All of these are working. So it's pretty good. And obviously there could be some improvements, but for a version like this that we're seeing, it's pretty good. So, what I'm about to
6:32

Web Based OS!

show you guys right now is somebody using the new Flash models, or the leaked models, and one-shot coding macOS and Windows browser-based systems. So, let's take a look at the Mac one first. If you look at it, it looks pretty good. The UI looks good, and it has added a files feature, an internet feature, a notes feature, and the terminal. So, we're in the files feature. There's nothing here. Can we click on this stuff? Looks like not. But if we look at this, it opens up the files. There's nothing there, which is okay. And then if we look at the browser, it opens up Google, which is pretty good. And it looks pretty realistic. Let's search up something. That's kind of funny: it has loaded pre-searched AI stuff. So, let's search up AI Studio, I guess. Can we actually click on this? Okay, obviously not, because it's a fake version. And then we also have a terminal, which is pretty good. Then let's look at the Windows one. So the Windows one is a little bit better. It looks more like Windows. It added Spotify, Twitter, and Notepad. Let's open up Spotify. That'll be interesting to see. Does it even open up? Looks like not. Twitter? No. Chrome? It does open up Chrome. There's a Wikipedia that we can go on. Let's just see if it does anything. Okay. "Welcome to Wikipedia." Oh, interesting. Let's just click. Okay, so the Wikipedia seems to work. Let's see if we can go on, like, YouTube on here. Does that work? Nope. That's okay. And the files: documents, downloads, projects. I can't click on any of these things. The terminal is there. You know, this is crazy, because we are getting AI to code things like this. But if you imagine what it was like a couple of years ago, or maybe even a couple of months ago, the progress in the space is really good, and, you know, it's really exciting to see what we're able to create now. So on Twitter, there's
8:22

SVG Code Test!

actually a Gemini 3 Flash. That's what they're calling the new model. It could be Gemini 3.5, it could be Gemini 3 Flash. We will only know once Google actually releases this. But before I show you guys what it has created, I want to show you what Gemini 3 used to look like back in the day and compare that to the new images. So if you look at this, somebody asked the models to create an SVG of an Xbox controller. GPT-5 created a Mickey Mouse version over here with levitating buttons, which is probably a feature that will be available in 2045. And then we have Sonnet, which looks like it was made in, I don't know, Microsoft Paint maybe. Then Gemini 3 is actually pretty good. Now, if we look at the new stealth models, this is the upgraded SVG code. So it looks obviously way better compared to all the previous versions of models that we have probably been used to. The buttons look pretty good. It looks a little bit 3D; adding these, you know, shadow features and stuff like that makes the controller more realistic. So, that's good to see. And if you guys think I'm glazing, this guy is a reputable source in the AI community, and he is saying, "Holy Gemini 3 Flash is insane. Maybe it'll be nerfed after launch. Hopefully not, but this is super fast. And with my anti-lazy prompt, it output about 3,000 lines like nothing." So, we can see that these models are getting stronger, and the Gemini 3 Flash model might be state-of-the-art as well, and OpenAI once again might be put under pressure.
9:49

Outro

If you enjoyed this video, this is what we do here: fast, clear updates on the biggest moves in AI. If you want to stay ahead of everything happening in this space, make sure you're subscribed. And if you want the hands-on side, demos, tools, workflows, and everything developers can actually build, well, check out World of AI. We also run a simple no-noise newsletter that gives you the most important AI tools and updates in just a couple of minutes. Subscribe here. Follow World of AI. Join the newsletter.
