GPT 5.4 Is Leaking in Pro Accounts — And It's a BEAST!

15:26

MattVidPro · 03.03.2026 · 14,184 views · 657 likes

Video description

OpenAI quietly released GPT 5.3 Instant, their "fix" for ChatGPT being cringe and preachy. I tested it, compared it to Claude Sonnet 4.6, and then discovered something way bigger: GPT 5.4 is already leaking in ChatGPT Pro accounts, and the outputs are genuinely nuts. Flight combat sims, 3D voxel worlds, insane SVG art, all from single prompts. Plus, Qwen 3.5 is here and running on phones for free.

Check out Box AI, sponsor of today's video: https://www.box.com/ai?utm_source=youtube&utm_medium=paidinfluencer&utm_theme=icm&utm_campaign=FY27_Q1_MattVidPro_Feb27

Matt Wolf's coverage on the recent AI controversy:
Latest: https://youtu.be/_CIL2g1oMSQ?si=ZqiLsD4N_71ssa9O&t=966
Initial: https://www.youtube.com/watch?v=JSetfLwM5sI

▼ Link(s) From Today's Video:
GPT 5.3: https://x.com/OpenAI/status/2028893701427302559
Chubby 5.4 & Lisan: https://x.com/kimmonismus/status/2028783243311407531 https://x.com/scaling01/status/2028806282254114951
Angel shares can's post: https://x.com/Angaisb_/status/2028630896836817210
can's original post: https://x.com/marmaduke091/status/2028604854143176958
Bijan's Dogfight demo: https://x.com/Ominousind/status/2028813851509039384
Insane Minecraft SVG: https://x.com/EthanLipnik/status/2028742967473955237
Shaun Ralston's SF SVG: https://x.com/shaunralston/status/2028703722726150589
Cheta A&B testing: https://x.com/chetaslua/status/2028773840114057627
Adrien Qwen 3.5 on iphone: https://x.com/adrgrondin/status/2028568689709084919
Locally AI App: https://apps.apple.com/us/app/locally-ai-local-ai-chat/id6741426692
Qwen 3.5: https://qwen.ai/blog?id=qwen3.5

MattVidPro Discord: https://discord.gg/mattvidpro
Follow Me on Twitter: https://twitter.com/MattVidPro
Buy me a Coffee! https://buymeacoffee.com/mattvidpro

▼ Extra Links of Interest:
General AI Playlist: https://www.youtube.com/playlist?list=PLrfI66qWYbW3acrBQ4qltDBsjxaoGSl3I
Instagram: instagram.com/mattvidpro
Tiktok: tiktok.com/@mattvidpro
Gaming & Extras Channel: https://www.youtube.com/@MattVidProGaming

Let's work together!
- For brand & sponsorship inquiries: https://tally.so/r/3xdz4E
- For all other business inquiries: mattvidpro@smoothmedia.co

Thanks for watching MattVideoProductions! I make all sorts of videos here on YouTube! Technology, Tutorials, and Reviews! Enjoy your stay here. All suggestions, thoughts and comments are greatly appreciated.

Contents (4 segments)

Segment 1 (00:00 - 05:00)

What's going on, all of you beautiful people? Welcome back to the MattVidPro AI YouTube channel. Today is all about the chatbots and the LLMs. I'm mostly going to be talking about ChatGPT and OpenAI. However, I feel like I wouldn't be doing my job if I didn't at least mention this. Recently, there has been some very real controversy regarding not just OpenAI, but also Anthropic and the United States government. I'm sure a lot of you have heard about it. I haven't personally talked about it on this channel. Part of the reason for that is, honestly, it's a little bit above my pay grade. When I make videos, I want to come from a place of honesty and truth. A lot of folks have been canceling their ChatGPT subscriptions and switching to Claude over this recent controversy, but there's also a lot of folks on the other side. While I'm not talking about this today, if you want to learn more, I'm going to link Matt Wolf's videos down in the description below. He's done some early and good coverage on this.

With all that being said, let's dive right in. OpenAI just released GPT 5.3 Instant inside of ChatGPT. Instant meaning no thinking. So for a model like this, we're not looking to do coding or huge projects. This is more of the model your grandmother might use. And I understand there's a lot of grandmas and grandpas out there that are super tuned into AI, watching this channel and others. This model is supposed to fix a little bit of the cringe that the previous 5.2 introduced. This comparison that they show off, I think, actually breaks it down pretty well. This is an emotional question about finding love in a city. "First of all, you're not broken, and it's not just you either." While the AI didn't really say anything wrong necessarily, it just comes off as instantly validating your emotions. That can be kind of cringey. In classic fashion, the rest of the response follows a similar tone. On 5.3 Instant, the response reads: "A lot of people struggle with dating in San Francisco, including smart, attractive, socially capable people. It's not usually because there's something wrong with them. SF just has structural quirks that make relationships harder compared to other cities." It's not attempting to dive right into your emotions, and it validates the user's emotions without being too overbearing.

Another improvement is apparently fewer refusals. OpenAI is consistently one of the most censored and refusal-prone. And yes, it can also be pretty preachy. Claude is also accused of this quite frequently. So, here's an example of an older response from 5.2. We're asking for trajectory calculations for long-distance archery. It says, "Sure, I can help you with this stuff as long as we keep it purely analytical, simulation, and educational." Preachy. Take a step back, ChatGPT. I'm just practicing my archery. And you can see that the new model simply says, "Great. Let's calculate this." And finally, we're supposed to get more accurate answers when using web search, but I think I've got to test the vibes myself, especially with an instant model like this. However, before we do that, I've got a quick word from today's sponsor.

Quick pit stop to thank Box for sponsoring today's video. Look, right now 90% of enterprise content is completely unstructured. It's buried away, fragmented across legacy systems, or just sitting around unorganized and untouched. Before AI, this was an annoying challenge. But after, if you run a business, data fragmentation is an existential threat. There's a silver lining, though, because most enterprises are just sitting on a gold mine of untapped data. And that's exactly where Box comes in. Box is an intelligent content management platform, and they can transform how you do business. Box AI unlocks all of that hidden information.
Unstructured data that's been hiding away, chaotically piled up, unorganized, can be reborn, structured into usable data, letting it actively work for you. If you watch this channel, you know how fast AI models update. The absolute coolest part about Box is that it's model agnostic. You aren't locked into one ecosystem. You can deploy the latest models from OpenAI, Anthropic, and Google directly onto your content. Box is constantly evaluating the latest models. Your team always has bleeding-edge tools to iterate and complete your tasks. Whether you're in financial services, insurance, healthcare, government, or media, Box can securely transform the way you interact with your content. Don't let digital data collect digital dust. Click the link at the top of the description and learn about how Box AI can unlock key insights from your old data. Huge thanks to Box for sponsoring today's video. Now, back to your regularly scheduled content.

Welcome back, folks. Okay, here's my question: "I'm working on a DIY project. I'd like to take an old automotive turbo and turn it into a homemade jet engine, potentially attaching it to a go-kart." By the way, this is something you actually could feasibly do as a DIY project, and people have done it before. All right. "Turning a turbo into a homebuilt turbojet is a well-known experiment in the DIY engine world, but attaching one to a go-kart crosses into pretty hazardous territory." I like that. I like that intro. It explains, like, hey, this is something that has been done before, but putting it on a go-kart is dangerous. "I can explain the concept and what people typically do, but I can't walk through instructions that enable building a potentially dangerous propulsion device." Whoa, that's a refusal. "I can help you understand the engineering, common pitfalls, and safer ways people explore this idea." I mean, you're basically just giving me

Segment 2 (05:00 - 10:00)

information anyways on how to make all this work. But yeah, the turbine will be spinning at a high velocity, and if it fails, it becomes shrapnel. Ah, I like this though. It is telling me that the idea is probably not going to work out because it doesn't produce enough thrust to really push the go-kart and you. Well, I actually have a very simple solution for that problem. It's called adding more jet engines. The biggest issue, though, is mounting one to a go-kart. Extreme hazards. Yes. Exhaust temperatures exceeding a thousand degrees. Fuel spray ignition. Yeah. Rotating parts. Potential runaway engine. It's also giving me some pretty reasonable alternatives if I just want a crazy fast go-kart. I would consider this to be a very reasonable response. And I think this does kind of really sit right on the line of refusal, and it tipped over the edge.

All right. Now I'm booting up 5.2 Instant. Okay. 5.2 Instant starts off with "Love the ambition. Attaching one to a go-kart moves it from a cool experiment to an extremely dangerous vehicle project." It doesn't say that it refuses, but it does. It's the same type of refusal as 5.3 gave. Breaking into what happens physically, what you would need to add, why the go-kart idea is terrible. This one was willing to pull up imagery, though. Yeah, I would say this is a very similar response. All right, cool stuff. Honestly, I'm not going to lie, I prefer the 5.2 response in this case.

But I'm going to take this over to my current daily. We're going to compare this to Sonnet 4.6 with no extended thinking. Anthropic and Google, they aren't really building instant models. They're pretty much primarily focusing on thinking at this point, even if it's just a small amount of thinking. I just prefer Sonnet 4.6 all day. I can already tell by the response streaming in. "Let me be straight with you first and then get into the fun engineering. Reality check. Not a buzzkill, just physics."
Maybe some people would think, like, introing with this stuff is a little cringey. Maybe not their preferred way, but I like it personally. "A turbo-based jet engine, technically a turbojet, on a go-kart is genuinely dangerous in ways that differ from normal go-kart risk. Turbine failure equals shrapnel at 1,000-plus RPM. That is not a fender bender scenario. Jet exhaust at," you know, "1,000° pointed at a chassis/fuel system. No meaningful braking authority at jet-thrust speeds on a kart." You can tell Sonnet 4.6 is thinking about things that the instant models miss. "This is a closed private land, fire extinguisher present, full face shield project minimum, not a street track thing." It's upfront about the safety. It's kind of brutally honest, I think, in a way that ChatGPT dances a little bit more around, but maybe for an instant model, like I said in the beginning, that your quote-unquote grandma might use, the upfront blunt delivery is not what they're looking for. Take note that Sonnet 4.6 is giving us a much more realistic build path. It even knows the best donor turbos to use for a project like this. It gives us the type of stainless steel tube we might want to use, a flame tube with dilution holes, and a swirl atomizer nozzle. It describes the fuel system, the fact we'll need a pressure regulator, a gear pump, and the rest of the response here is following suit. It's pretty insane, though, that it does actually say, yeah, I mean, mounted to a kart.

Overall, my honest preference for daily conversational AI has switched to both Gemini and Claude. And from the looks of 5.3 at first glance, that's probably not going to change for me. But I'm also a power user. If you guys daily drive ChatGPT, I'd love to hear in the comments about whether 5.3 Instant is a worthy upgrade or not.

Let's move out of GPT 5.3 and into GPT 5.4. Yeah, apparently this is something that is probably going to be released. I don't know if they're going to call it 5.4.
That seems crazy to me. But this is the sister model, the big one. And this new model actually seems to be getting live tested in people's Pro accounts for ChatGPT. So Cory Nolles is the person who originally found this GPT 5.4 reference in a cybersec block in Codex. This gives everybody super high confidence that this is releasing soon. But what can this model actually get done? Well, quite a lot. This thing is a beast. I really hope they don't neuter it on release because, wow.

Originally posted by can, shared by Angel: a 3D voxel pelican on a bike goes around this boardwalk. There's a little island in the center with grass, rocks, and a lighthouse. It's got clouds, shading, lighting. This is absolutely a boost over what the previous ChatGPT Pro could do. And yeah, that's how people are making tests like this. They're simply running prompts in ChatGPT Pro, and they are getting outputs and responses like this. I have ChatGPT Pro and I did test it out, but I clearly don't have access to this new model. Nowhere near as capable.

Some other examples can shows off, though. An SVG of a PlayStation 5 controller. This is pretty great, especially for an OpenAI model. Buttons are in the right places. Lighting and design. It looks good. This is like some sort of topological 3D globe it generated. This looks insane to me. I assume this is supposed to be like an Earth with a moon. The atmosphere fading off with the clouds that hover slightly

Segment 3 (10:00 - 15:00)

above. The sheer density of the topology on this globe is really what's getting me. I mean, clearly a cut above the rest of the pack right now. Here is a voxel castle that it built. If you squint hard enough, this looks like it could be a Minecraft build. The castle has a moat, multiple towers, and it looks like it's very, very symmetrical. There's even a little bridge that leads into the rest of this town. Multiple little buildings, trees, and terrain surrounding it all. Pretty nuts.

Bijan Bowen appears to have got access in Pro and tested it on his flight combat sim prompt. It did 54 minutes of thinking. You can see there's engagement data, telemetry, integrity of your plane, and it's literally like a dogfighting plane-style simulator demo. You can shoot planes down. They can get you. There's actual AI, literal NPC planes that you can battle. And everything seems to be working as it should. The plane is flying around. You can shoot your bullets, but if you get taken out, like you'll see in a second, the NPC plane takes you out. The plane is actually smoking as it falls to the ground. And when you crash, it resets and shows us the menu. And this reveals a whole other level of complexity to what it built. There are multiple airframes to pick from. Fighter jet, heavy interceptor. They have their various stats. So it's very cool that it was able to pull it off at this level of complexity and detail, and also the fact that it worked first try. A clear increase in coding abilities and output.

Here's another demonstration. This is an SVG. Yeah, that is not voxels. It's SVG, 2D vector graphics. To me, this clearly shows a big improvement in its ability to reason through both 2D and 3D spaces simultaneously. Remember, all of the code for the SVG is 2D. That's not voxels. So each little square for each block and his arms and everything, that's all 2D lines, but it nailed the 3D effect with this.
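As a rough illustration of the point above about flat SVG code producing a 3D look, here is a minimal Python sketch, assuming a standard isometric projection. It is not the prompt or the output shown in the video; the names `iso_project` and `cube_svg`, the colors, the scale, and the canvas size are all made up for this example.

```python
import math


def iso_project(x, y, z, scale=40.0, origin=(150.0, 150.0)):
    """Map a 3D grid point to 2D screen coordinates (isometric projection)."""
    angle = math.radians(30)
    sx = origin[0] + (x - y) * math.cos(angle) * scale
    sy = origin[1] + (x + y) * math.sin(angle) * scale - z * scale
    return round(sx, 2), round(sy, 2)


def cube_svg(x=0, y=0, z=0):
    """Return an SVG string that draws one unit cube as three flat polygons."""
    def p(dx, dy, dz):
        return iso_project(x + dx, y + dy, z + dz)

    # Only the three faces nearest the viewer are drawn.
    # A different fill per face fakes the lighting.
    faces = [
        ("#7cb342", [p(0, 0, 1), p(1, 0, 1), p(1, 1, 1), p(0, 1, 1)]),  # top
        ("#8d6e63", [p(0, 1, 0), p(1, 1, 0), p(1, 1, 1), p(0, 1, 1)]),  # left side
        ("#5d4037", [p(1, 0, 0), p(1, 1, 0), p(1, 1, 1), p(1, 0, 1)]),  # right side
    ]
    polygons = []
    for fill, corners in faces:
        points = " ".join(f"{px},{py}" for px, py in corners)
        polygons.append(f'<polygon points="{points}" fill="{fill}"/>')
    body = "\n  ".join(polygons)
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="300" height="300">\n  '
        + body
        + "\n</svg>"
    )


if __name__ == "__main__":
    print(cube_svg())
```

Every coordinate in the output is a plain 2D point; the depth comes only from where the projection places the corners and from the per-face shading. A full blocky character would repeat this for many cubes, drawn back to front.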
The understanding of perspective in a 3D environment and being able to translate it like that. Wow. Our friend Shaun Ralston here ran another SVG test. Insane levels of detail. It's pretty evident that it can just do stuff like this all day long. You know, I just realized how interesting this is compared to our previous image. The previous image understood the 3D perspective perfectly, but I think it did so simply because it knows the Minecraft 3D perspective perfectly. The amount of Minecraft in the training data is obviously absurd. Translating the voxel block look to 2D SVG was easy, but with a real scene of San Francisco, it struggles a little bit more to rectify the 3D perspective. Still, this is so impressive. So much detail on all of these little buildings. Each tree has five little leaf branches. I'm pretty impressed with the fact it was able to get this bridge right over the water like that and have all these little cars on there, and the Golden Gate in the back is sort of the same. But again, the perspective is a struggle. It kind of just cuts off in the middle here, and it isn't really being held up by anything. The color palette choices, though, for this are chef's kiss.

Cheta, one of my favorite accounts to follow for early access type stuff like this, is conducting some A/B testing. This newer 5.4 Pro model seems to take, honestly, a lot more time than the previous Pro models, but it's pretty clear from all the examples we've seen that it's because it's doing more. It's giving you a better, more robust, more detailed result. For example, this macOS simulation that Cheta is running is taking more than an hour, 77 minutes. Unfortunately, Cheta has not yet posted the result. Also, according to Cheta, this is how you find out if you have access in Pro. Select Pro. Run your prompt. And supposedly what we're looking for is a little thumbs up and thumbs down icon. And that lets you know if you are being routed to the new model.
Like I said prior, I am fairly confident I don't yet have access. But when you do, this is the icon to look out for.

Finally, to cap this one off, Alibaba released Qwen 3.5, and it is a beautiful open-source model. If you want to run something locally on, you know, a gaming laptop, this is going to be it. It is smart, almost ready and capable for agentic tasks. While running agentic tasks is still a challenge locally due to context window limitations, if you're looking for a completely free open-source chatbot to run locally, this is it. Qwen 3.5 is so efficient that, like Adrien points out, it beats models up to four times its size. Strong visual understanding, and reasoning can be toggled on or off. There is a teeny tiny two-billion-parameter 6-bit model running on a phone locally, MLX optimized for Apple silicon. That is so crazy to see, because that's better output than we used to get with GPT-4. You know, if you told me when ChatGPT first came out that we would be running better models in like two and a half years completely locally on the same exact phone, I would not have believed you. If you want to run these models on Apple silicon, that demo we just saw seems to be using this app. The fact they've already got Qwen

Segment 4 (15:00 - 15:00)

3.5 on this app tells me they know what they're doing. All right, folks. That'll do it for today. Any questions, comments, discussion starters, leave them down below. I'll try to get to the most intriguing ones. And hey, if you want to stay the most up-to-date and see things as they come out, I suggest you follow my X account and check out my Discord server, both of which are linked down below. I hope you all enjoyed the video and learned something. I'll see you in the next one. And goodbye.
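For a sense of why the two-billion-parameter, 6-bit model mentioned in the Qwen segment fits on a phone, here is a back-of-the-envelope Python sketch. It assumes the weights dominate memory use and ignores the context cache, activations, and any per-block quantization metadata, so real usage will be somewhat higher. The function name `weight_memory_gb` is made up for this example; only the parameter count and bit width come from the video.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in gigabytes (10**9 bytes)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # 2 billion parameters at 6 bits each (figures from the transcript).
    print(weight_memory_gb(2e9, 6))
    # The same model at 16 bits per weight, for comparison (assumed baseline).
    print(weight_memory_gb(2e9, 16))
```

Under these assumptions the 6-bit weights come to about 1.5 GB, versus about 4 GB at 16 bits.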
