# This New OpenAI Leak Changes Everything About GPT-6

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=mAPMzgLymOQ
- **Date:** 10.03.2026
- **Duration:** 15:57
- **Views:** 21,440

## Description

🌐Subscribe To My Newsletter - https://aigrid.beehiiv.com/subscribe
Get your Free AGI Preparedness Guide - https://theaigrid.kit.com/agi
🎓 Learn AI In 10 Minutes A Day - https://www.skool.com/theaigridacademy
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid

Welcome to my channel where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

Music Used

LEMMiNO - Cipher
https://www.youtube.com/watch?v=b0q5PR1xpA0
CC BY-SA 4.0
LEMMiNO - Encounters
https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Contents

### [0:00](https://www.youtube.com/watch?v=mAPMzgLymOQ) Segment 1 (00:00 - 05:00)

So, something big is brewing at OpenAI, and I'm pretty sure it leads to GPT-6, so we need to talk about it. All right, let's talk about how this leak actually started and all of the information surrounding the new AI model. On March the 8th, a guy named Atai Alleti, who works on the voice team at OpenAI, posted something on X. Someone had been complaining that it's been almost 2 years and we still don't have a true omni model, and Alleti basically responded by asking, "What would you like to see in the new omni model?" And that post blew up. It got over 50,000 views and 100 likes. Not crazy by Twitter standards, but enough for me and a few other AI enthusiasts to see it. That alone would have been interesting enough, but the other day, OpenAI employees started to chime in. Brandon McKenzie, a researcher at OpenAI who previously worked on multimodal AI at Apple, replied and said that a new omni model sounds like a great idea. And then another OpenAI team member, Huda Knight, flat-out said, "It's coming" (with two exclamation marks, by the way). And these aren't random people. These are the engineers and researchers who are actually behind this stuff and are constantly building it.

Then the next day, The Decoder, a well-known AI news outlet, confirmed that the new model appears to be a successor to GPT-4o. For those of you who might not remember, GPT-4o was supposed to be OpenAI's big multimodal moment back in 2024. The "o" literally stands for omni, and the idea was that you could have one model that could handle text, images, video, basically everything, all natively, all at once. Not a bunch of separate systems stitched together behind the scenes, but one unified brain. But here's the thing: GPT-4o never really delivered on that promise. A lot of the features they showed us at the launch event were either rolled out in a limited way or never released at all, and a lot of people have been really frustrated after 2 years of waiting. I remember when I saw the demo, it was pretty incredible. But when it was released, the voice just sounded completely flat; it wasn't as expressive or as human as what we saw in the real-time demo. I'm pretty sure some of you are familiar with this.

Now, if you're wondering what an actual omni model would be, think of it this way. Right now, when we use ChatGPT, there are different kinds of models. You can type into it, you can talk to it, you can show it a picture. But under the hood, those are often handled by different systems working together. A true omni model would be one single system that processes everything at once: your voice, an image you're showing it, a video, the text on your screen, all flowing through one brain simultaneously, as you can see in the demonstration here. So, no more multiple systems, just one brain that takes in everything. And here's what makes this especially exciting. OpenAI's current smartest model is GPT-5.4, which was released a few days ago. So imagine having that level of intelligence, the best reasoning and problem solving they've ever built, and giving it the ability to see, hear, and speak natively. Not as add-ons, not as separate features bolted on after the fact, but built into the model from the ground up. That's the potential here.
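To make that contrast a bit more concrete, here is a purely illustrative Python sketch of the difference between the "stitched together" approach and the omni idea described above. None of this is OpenAI's actual architecture or API; every function and name below is an invented stub, only meant to show where a pipeline flattens everything into text and where a unified model wouldn't have to.

```python
# Purely illustrative sketch (not OpenAI's architecture or API): a
# "stitched together" multimodal pipeline versus a single omni-style call.
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    audio: bytes  # synthesized speech for the reply


# --- Today: separate systems chained behind the scenes ------------------
def transcribe(audio: bytes) -> str:       # hypothetical speech-to-text model
    return "what's in this picture?"       # stubbed output for illustration


def describe_image(image: bytes) -> str:   # hypothetical vision model
    return "a cat sitting on a keyboard"


def text_llm(prompt: str) -> str:          # hypothetical text-only LLM
    return f"Based on the image, the answer to '{prompt}' is: a cat."


def synthesize(text: str) -> bytes:        # hypothetical text-to-speech model
    return text.encode()


def stitched_assistant(audio: bytes, image: bytes) -> Reply:
    # Every hop flattens the input into plain text, so tone of voice,
    # timing, and fine visual detail are lost between the models.
    question = transcribe(audio)
    caption = describe_image(image)
    answer = text_llm(f"{question} (image: {caption})")
    return Reply(text=answer, audio=synthesize(answer))


# --- The omni idea: one model takes every modality natively -------------
def omni_assistant(audio: bytes, image: bytes, text: str) -> Reply:
    # A single network would attend to raw audio, pixels, and text together,
    # with no intermediate transcript or caption in the middle.
    answer = (f"(one model reasoning over {len(audio)} bytes of audio, "
              f"{len(image)} bytes of image, and the text '{text}')")
    return Reply(text=answer, audio=synthesize(answer))


if __name__ == "__main__":
    print(stitched_assistant(b"\x00" * 16, b"\x00" * 16).text)
    print(omni_assistant(b"\x00" * 16, b"\x00" * 16, "what's in this picture?").text)
```

The only point of the sketch is the shape of the call: one function signature that accepts audio, image, and text together, instead of three models passing strings to each other.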
And honestly, when you look at the trajectory of where OpenAI is heading with hardware and with GPT-6, which I'm going to get into in a minute, this new omni model might be one of the most important things they release, because everything else they're building actually depends on it. But before we get into the hardware and the future, there's another massive piece of this puzzle that we need to talk about, and it has to do with how we actually talk to the models.

This diagram here shows you how we currently talk to ChatGPT, and if you've ever used it, you know it can feel a little bit awkward. There's a good reason for that. The way it works right now is turn-based: you talk, then you stop, then the AI processes what you said, and then it talks back. Think of it like a walkie-talkie. The problem is that if you make any sound while the AI is talking, even an "okay" or an "mhm" to show that you're following along, the AI interprets that as you trying to interrupt it, and it just stops mid-sentence. It sort of kills the flow of the whole conversation. But according to The Information, OpenAI is building a new audio model to fix this exact problem. It's called BAI, short for bidirectional, and the idea is pretty straightforward: instead of taking turns, both you and the AI can communicate at the same time, just like in a real human conversation. Think about how you actually talk to another person. You nod, you say, "Uh-huh." You might jump in with a quick question while waiting for them to finish their whole thought. And the person you're talking to doesn't just freeze up when that happens. They adapt. They might pause, acknowledge your question, and keep going. That's why BAI is going to be really effective. It continually processes your voice so it can adjust in real time when you interrupt or react.
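As a rough sketch of that difference, here is a small, purely hypothetical Python illustration, not OpenAI's real BAI implementation: a turn-based loop that halts the moment it hears anything, versus a full-duplex loop that keeps listening while it speaks. All function names and the fake audio chunks are invented for illustration.

```python
# Hypothetical sketch, not OpenAI's real BAI implementation: a turn-based
# ("walkie-talkie") loop that halts on any user sound, versus a full-duplex
# loop where listening and speaking run at the same time.
import asyncio
from collections import deque

USER_SOUNDS = deque(["", "mhm", ""])   # the user murmurs "mhm" mid-reply


async def listen_chunk() -> str:
    await asyncio.sleep(0.05)          # pretend to capture ~50 ms of audio
    return USER_SOUNDS.popleft() if USER_SOUNDS else ""


async def speak_chunk(text: str) -> None:
    await asyncio.sleep(0.05)          # pretend to play a bit of speech
    print(f"AI says: {text}")


# --- Turn-based: any sound from the user halts the AI mid-sentence ------
async def turn_based(reply_chunks: list[str]) -> None:
    for chunk in reply_chunks:
        if await listen_chunk():       # even an "mhm" counts as an interruption
            print("AI: (stops mid-sentence)")
            return
        await speak_chunk(chunk)


# --- Bidirectional: keep listening while speaking, and adapt ------------
async def bidirectional(reply_chunks: list[str]) -> None:
    async def listen_loop() -> None:
        while True:
            sound = await listen_chunk()
            if sound:
                # A real model would fold this into its context and adjust;
                # here it simply acknowledges without breaking the reply.
                print(f"AI: (hears '{sound}' and keeps going)")

    listener = asyncio.create_task(listen_loop())
    for chunk in reply_chunks:         # speech continues while listening
        await speak_chunk(chunk)
    listener.cancel()


if __name__ == "__main__":
    reply = ["Here is", "the long answer", "you asked for."]
    asyncio.run(turn_based(reply))       # stops as soon as it hears "mhm"

    USER_SOUNDS.extend(["", "mhm", ""])  # reset the fake user audio
    asyncio.run(bidirectional(reply))    # finishes the reply anyway
```

The key design difference is the concurrent `listen_loop`: incoming audio becomes context the speaker can react to, rather than a stop signal, which is the behaviour described above.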

### [5:00](https://www.youtube.com/watch?v=mAPMzgLymOQ&t=300s) Segment 2 (05:00 - 10:00)

Now, here's the honest part: they built a prototype, but apparently it's not perfect yet. It works for a few minutes and then starts glitching, producing some weird-sounding voice notes. They originally wanted to ship this in the first quarter of 2026, but it's looking more like the second quarter, or possibly even later. But here's why this matters so much beyond just making ChatGPT sound more natural. OpenAI believes that closing the gap between voice AI and text-based AI could massively expand who uses artificial intelligence around the world. Think about it: for most people on the planet, talking is a way more natural thing than typing. If they can make talking to AI feel as seamless as talking to a friend, that opens the door for hundreds of millions of new users.

And there's a huge business angle here, too. Think about customer support. Right now, tons of companies are trying to use AI to handle phone calls: airlines, banks, retailers, you name it. But the experience is usually pretty rough, particularly because the AI can't handle the natural back and forth of a real phone conversation. Imagine you're talking to a retailer's AI assistant about returning a product, and mid-conversation you change your mind and want to exchange it instead. With today's AI voice, that kind of pivot is clunky at best. The AI gets confused, or you have to repeat yourself, or the entire thing breaks down. But with BAI, the AI could smoothly adapt just like a real customer service rep would: it hears you changing direction, adjusts on the fly, and the conversation keeps flowing naturally. That's a game-changer for any company using AI to handle customer interactions, and it's one of the reasons OpenAI is investing so heavily in this technology.

Okay, now let's zoom out a bit, because this new omni model and BAI don't exist in a vacuum. They're pieces of a much bigger puzzle, and that puzzle leads straight back to GPT-6. Back in 2025, Sam Altman, OpenAI's CEO, said publicly that GPT-6 is already in the works and that it won't take as long as GPT-5 did. If you remember, GPT-5 took a while, so hearing that GPT-6 is moving faster is a pretty big deal. And you can already see that there's real infrastructure backing this up. OpenAI has a partnership with AMD, the chip company, to deploy 6 GW of computing power. To put that into perspective, that's an enormous amount of compute dedicated to training AI models. The first gigawatt is expected to come online in the second half of 2026, which lines up perfectly with when you'd need that kind of power to train something like GPT-6. Now, if you're wondering about release dates, most credible estimates put the timeline like this: we could see a GPT-6 developer preview, meaning researchers and app builders get early access, sometime in the third or fourth quarter of 2026, and then a broad rollout to regular ChatGPT users would likely come in the first quarter of 2027. There's always a chance things slip; in a more pessimistic scenario, we might not see it until mid-2027 or even later. But the pieces are already being put in place.

So, what would GPT-6 actually bring to the table? Three big things stand out. First, long-term persistent memory. Right now, every time you start a new conversation with ChatGPT, you're basically starting from scratch.
GPT-6 is expected to actually remember who you are across sessions: your preferences, your past conversations, the context of your life. It would be like talking to an assistant who actually knows you. Second, autonomous agentic capabilities. That's a fancy way of saying the AI will actually take actions on your behalf. Not just telling you how to book a flight, but actually going ahead and booking that flight. Not just drafting an email, but going ahead and sending it. We're already seeing early versions of this: OpenAI's current top model, GPT-5.4, is already pretty good at computer use, where it can literally operate your computer, and GPT-6 would take that much further. Third, full native multimodality, which brings us right back to where we started. The new omni model that OpenAI employees are teasing could very well be the multimodal backbone that GPT-6 is built around, or it could be a parallel effort that eventually merges with GPT-6 when it launches. Either way, these things are deeply, deeply connected.

Think of it as three layers of the same system. The omni model gives GPT-6 its eyes and ears: the ability to process images, video, and audio natively. BAI gives it its natural voice: the ability to have a real, fluid conversation with you. And GPT-6's raw intelligence ties it all together, giving you the smartest AI ever built that can also perceive the world the way you do. That's the vision. And it gets even more interesting when you look at what OpenAI wants to put this intelligence inside of. This is where things get really interesting, and this is the part that blew my mind when I was putting everything together. OpenAI already has 200 people working on physical hardware devices. Not software, not apps, but actual things that you hold, wear, and put in your home. And this is

### [10:00](https://www.youtube.com/watch?v=mAPMzgLymOQ&t=600s) Segment 3 (10:00 - 15:00)

where the omni model and BAI become absolutely crucial, because those devices need true multimodal AI to work properly. Without it, they're just expensive gadgets.

First up are, of course, the earbuds, a project code-named Gumdrop, which has been reported on by TechCrunch, Mashable, and Axios. They're open-style AI earbuds, so not noise-cancelling. They sit in your ear and let you hear the world around you whilst also giving you an AI assistant right there in your head. The really interesting technical detail is that they'll have a custom 2nm processor built right into them. That means a lot of the AI processing happens on the device itself rather than in the cloud, which is going to make things faster and more private. For manufacturing, OpenAI has been in talks with Foxconn, the company that builds iPhones, and another manufacturer called Luxshare. And here's the wild thing: their first-year sales target is 40 to 50 million units. That's an incredibly ambitious number for a brand-new product category. For context, AirPods sell something in that range annually, so OpenAI is basically saying they want to compete at that scale right out of the gate.

Next, there's the smart speaker with a built-in camera that's expected to cost somewhere between $200 and $300. This one's been reported on by Reuters and The Information. That 200-plus-person team? This is what they're actually building. The camera isn't just for video calls; it's going to give the AI visual context. It can see your room, identify objects, see who's talking to you. It would even have Face ID-style authentication, so you could authorize purchases just by looking at it. The company Goertek is reportedly in talks to supply the speaker modules, and this one isn't expected before February 2027, so it's still a bit of a way out. Next, we have the smart glasses, but those are even further down the road; mass production isn't expected until 2028. And there's apparently a prototype of a smart lamp, although it's not really clear if that one will ever become a product.

But the most exciting one is what I'll call the Jony Ive mystery device. At Davos in January 2026, OpenAI confirmed that this device is on track for a reveal in the second half of this year. Sam Altman described it as something that's more peaceful and calm than a smartphone: no screen, small enough to fit in your pocket. Previous reporting even mentioned it may have a pen-like shape. And now a court filing has revealed it won't actually ship to consumers before the end of February 2027, but we should at least see what it is later this year. They originally planned to brand this device io, but they had to abandon that name because of a trademark dispute, so I'm not sure what it's going to be called just yet. But whatever the name, the concept is fascinating. If it works the way Sam Altman describes it, it could be a new category of consumer tech, and I think about that a lot.

Think about all of those AI devices for a second: earbuds that can hear you and respond naturally, a speaker that can see you in your room, glasses that put AI in your field of vision, a pocketable device that replaces your need to pull out a phone. Each of them needs an AI that can speak, see images, and stay aware of context, all in real time. And that's why the omni model matters. That's why BAI matters. That's why GPT-6 matters.
And that's why we have to think about this. The hardware is going to be the body, the omni model the brain, and BAI the voice. Without them, none of those devices are going to work.

So let's take a step back and look at the full picture, because when you connect all of these dots, something really profound starts to emerge. OpenAI isn't just building a better chatbot. They're building what I'd like to call an ambient AI ecosystem. The omni model is the brain: one unified intelligence that can process everything it sees, hears, and reads. BAI is the voice: the technology that makes talking to this brain feel as natural as talking to another person. And the hardware (the earbuds, the speaker, the glasses, the mystery device) is the body, how it's going to show up in the world. Right now, ChatGPT is an app that you open. You go to it when you need it, you type or talk, you get an answer, and then you close it. But OpenAI wants to fundamentally change this relationship. They want it to be an AI that you live with: in your ears while you're walking down the street, on your kitchen counter, listening and watching, and eventually on your face as you go about your day. And when you combine all of that with what GPT-6 is expected to bring, persistent memory, meaning it knows who you are, and autonomous capabilities, meaning it can go and do things for you, what you're looking at is a play for the post-smartphone world. This is OpenAI's bet on what comes after the device in your pocket.

Now, some of you might be saying, "Haven't we tried this before?" The Humane AI Pin launched with massive hype and was supposed to replace your iPhone, but it flopped spectacularly. The Rabbit R1 promised a completely new way to interact with AI, and honestly, it was basically an expensive toy that couldn't do much of anything. So, why would OpenAI's hardware be any different? Why should we believe that this time it will work? Well, the thing is that currently when we

### [15:00](https://www.youtube.com/watch?v=mAPMzgLymOQ&t=900s) Segment 4 (15:00 - 15:57)

look at OpenAI's users, they have nearly a billion weekly users on ChatGPT. Think about that for a second: a billion people already using the AI every week. That's not a scrappy startup hoping people are going to show up and give the product a chance. That's a massive, established user base that already knows the AI, already trusts it, and already relies on it every single day. If you give those people an earbud or a smart speaker that connects to the AI they already use every day, that is completely different from asking them to adopt some brand-new platform from a company they've never heard of. And second, we have to think about Jony Ive. Say what you want about the guy, but his track record speaks for itself: the iPhone, the iPod, the MacBook Air, the iMac. He has an almost unmatched ability to take complex technology and make it feel simple, beautiful, and desirable. And if anyone can crack the design problem of AI hardware and make something people want to use every day, it's probably him.

---
*Source: https://ekstraktznaniy.ru/video/11410*