# Google’s New Breakthrough Brings AGI Even Closer - Titans and Miras

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=BTeNmrv6gPA
- **Date:** 05.12.2025
- **Duration:** 10:12
- **Views:** 20,489
- **Source:** https://ekstraktznaniy.ru/video/12627

## Description

Want to stay up to date with AI news - https://aigrid.beehiiv.com/subscribe
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website - https://theaigrid.com/


Links From Today's Video:
https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

Music Used

LEMMiNO - Cipher
https://www.youtube.com/watch?v=b0q5PR1xpA0
CC BY-SA 4.0
LEMMiNO - Encounters
https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Transcript

### Segment 1 (00:00 - 05:00) [0:00]

So Google may have just solved one of AI's biggest weaknesses with their new breakthrough. Let's talk about it. Google probably just solved AI's biggest weakness, and that is memory. Every AI you use, you know, ChatGPT, Claude, and Gemini, all of them have a temporal problem. And this is where Google has introduced Titans and Miras. These are two different research papers that are helping AI to have long-term memory. And trust me, this is game-changing. So, I don't know about you guys, but you know how when you're having a long conversation with an AI, the longer it goes, the more they tend to forget? Like, if you try to make them read an entire book, they kind of lose track of what happened at the beginning by the time they reach the end. And that's because of the technology they're built on, called Transformers. It basically gets quadratically slower and more expensive the more text it has to remember. This has basically been a fundamental limitation that nobody could fix, until now. So Google, like I said, just published two research papers that change everything: Titans and Miras. Titans is a brand new AI architecture that gives models actual long-term memory. We're talking over 2 million tokens of context. That's multiple entire books remembered correctly. But it's not just about storing more. You see, Titans copies how the human brain works. It has this kind of surprise metric that prioritizes unexpected, important information and ignores boring, routine stuff, exactly like your brain does. What's even crazier is that it can learn and update its own memory while it's running, which is something no other AI can do. And Miras is the theoretical breakthrough underneath it all. It basically reveals that every major AI architecture, Transformers, RNNs, Mamba and everything, is secretly doing the same thing, just differently. And this framework opens the door to designing way better memory systems. Now, this is the MAC (memory as context) architecture.
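The "surprise metric" idea can be sketched in a few lines. To be clear, this is a hypothetical toy, not Google's actual implementation: it uses a single weight matrix (`W`) instead of Titans' deep memory MLP, and it treats surprise as the gradient magnitude of an associative reconstruction loss, which is the spirit of the paper's gradient-based surprise signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy associative memory: one weight matrix mapping keys to values.
# (Titans uses a deep MLP here; a matrix keeps the sketch short.)
W = np.zeros((4, 4))

def surprise(W, k, v):
    """Gradient of the associative loss 0.5 * ||W k - v||^2.
    A large gradient means the memory did not expect this pair,
    i.e. the input is surprising."""
    residual = W @ k - v
    grad = np.outer(residual, k)  # d/dW of the loss above
    return grad, np.linalg.norm(grad)

def memorize(W, k, v, base_lr=0.1):
    """Update the memory, scaling the step by how surprising the input was:
    routine inputs barely change it, surprising ones change it a lot."""
    grad, s = surprise(W, k, v)
    return W - base_lr * s * grad / (1.0 + s), s

k, v = rng.standard_normal(4), rng.standard_normal(4)
W, s1 = memorize(W, k, v)   # first exposure: surprising
_, s2 = surprise(W, k, v)   # after storing: the same pair is less surprising
```

After one write, querying the memory with the same key produces a smaller gradient, so the model's attention naturally shifts to whatever it has not seen before.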
This diagram shows how Titans organizes its memory, and it's directly inspired by how human brains actually work. Scientists have known for decades that humans don't just have one memory; we actually have different memory systems that handle different jobs. And this is where you guys can see right here that Titans essentially copies this design. If we break down these three layers here, we can break it down pretty easily. The top layer is contextual memory. This is the learning part, the long-term memory module. Unlike previous AI systems that stored memories in a simple vector or matrix, which is basically just a pile of numbers, Titans uses something way more powerful: a multilayer perceptron. That's a fancy term for a small neural network inside the bigger neural network. Think of it as having a mini brain dedicated to remembering stuff. And this long-term memory doesn't just passively record things like a security camera. It actively learns. It figures out patterns, themes, and connections between things that might be thousands of words apart. If you mention a character named Bob on page one and then call him "the tall man" on page 500, good long-term memory can connect those as the same person. The middle layer is the core, the in-context learning part. This is essentially the same attention mechanism that made Transformers famous. It's really good at precise short-term memory. Like, if someone asks "what was the last word I said?", this is the part that handles that. It's looking at the immediate context and figuring out what's relevant right now. The clever part of Titans is how these two layers work together. The long-term memory compresses and summarizes everything from the past, then hands a summary report down to the attention layer. The attention layer can then decide: do I need to look at this summary of the past, or is the immediate context enough?
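That interplay can be sketched very roughly in code. This is a heavily simplified, hypothetical illustration of the "memory as context" (MAC) idea, with made-up names (`persistent`, `long_term_summary`) and random vectors standing in for real learned states: the three memory sources are simply concatenated into one sequence, so attention itself can weigh the past summary against the immediate context.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

persistent = rng.standard_normal((2, d))         # fixed weights, never updated
long_term_summary = rng.standard_normal((2, d))  # output of the memory module
context = rng.standard_normal((5, d))            # the current token window

def attend(query, keys):
    """Single-query softmax attention; keys double as values in this sketch."""
    scores = keys @ query / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ keys, weights

# MAC concatenates all three sources, letting the attention weights decide
# whether the past summary or the immediate context matters right now.
memory_as_context = np.concatenate([persistent, long_term_summary, context])
query = context[-1]
output, weights = attend(query, memory_as_context)
```

Because the summary tokens sit in the same sequence as the context tokens, low attention weights on them amount to "ignoring" long-term memory, which is exactly the option described above.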
It has the option to use long-term memory or to ignore it depending on what's needed. And the bottom layer is persistent memory. This is the fixed-weights part. This is the knowledge that's baked in during training and does not change. Think of this as your instincts or your fundamental knowledge: things like understanding grammar, knowing that fire is hot, or recognizing that a dog and a puppy are related concepts. This layer provides the foundational intelligence that everything else builds on. The magic happens when all three layers work together. The persistent memory provides the base knowledge, the long-term memory tracks everything important that's happened, and the core attention focuses on what's immediately relevant. It's kind of like having three different memory systems all working together as a team, just like humans do. And I'm pretty sure you can start to tell why this is an important step towards AGI. Okay, so this is where things get really interesting from a scientific perspective. Miras isn't a specific AI model that you can download and use. It's more like a discovery, a theoretical framework that reveals something profound about how all sequence models work. Here's what the researchers figured out. Every major breakthrough in AI sequence modeling, Transformers, RNNs, Mamba, all of them, they're all secretly doing the same thing. They're all different ways of building what's called associative memory. An associative memory is simply a system that connects inputs to outputs, keys to values, questions to answers. Think about it like this. Imagine there are 100 different car manufacturers and they all claim their cars are totally unique and revolutionary. Then an engineer comes along and says, "Actually, every single car is just four wheels, an engine, and a steering mechanism. You're all doing the same thing, just with different designs." That's what Miras does for AI. It unifies everything.
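The "keys to values" idea is concrete enough to demonstrate. Here is the textbook linear associative memory with a Hebbian outer-product write rule; it is a classical toy, not the Miras formulation itself, but it shows the key-value binding that the framework says every sequence model is doing in some form.

```python
import numpy as np

def store(M, key, value):
    """Write: bind a value to a key with an outer-product update."""
    return M + np.outer(value, key)

def recall(M, key):
    """Read: query the memory with a key."""
    return M @ key

d = 8
rng = np.random.default_rng(1)
# Orthonormal keys make recall exact in this toy setting.
keys, _ = np.linalg.qr(rng.standard_normal((d, d)))
values = rng.standard_normal((d, 3))

M = np.zeros((d, d))
for i in range(3):
    M = store(M, keys[:, i], values[:, i])

# Querying with a stored key retrieves its value (up to float error).
recovered = recall(M, keys[:, 0])
```

Transformers, RNNs, and Mamba differ in how the writes and reads are parameterized and compressed, but in the Miras view they are all variations on this same store/recall loop.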
Miras defines any sequence model through four design choices. Number one is the memory architecture. This is the physical structure of the memory: where and how do you store the information? Some models use a simple vector, just a list of numbers. Some use a matrix, a grid of numbers. And Titans uses a deep neural network, which is way more complex and powerful. The architecture determines how much information you can store and how flexibly you can organize it. Then you've got attentional bias. This is what the model pays attention to. Every model has an internal objective it's trying to optimize. When new information comes in, the model has to decide: is

### Segment 2 (05:00 - 10:00) [5:00]

this important? Should I focus on this? The attentional bias determines the priorities. Different models have different biases, which is why models behave differently on the same input. Then there's the retention gate. This is where you've got the forgetting mechanism. Here's something most people don't realize: forgetting is actually just as important as remembering. If you remembered every single detail of every single moment of your life, your brain would be completely overwhelmed and useless. You actually need to filter things down. The retention gate controls what gets kept and what gets tossed. Miras reframes this as regularization, basically rules that prevent the memory from going crazy and keeping everything. And finally, this is where you've got the memory algorithm. This is how the memory actually updates. When you learn something new, what's the exact mathematical process for incorporating it into existing memories? Different algorithms have different trade-offs between speed, accuracy, and stability. Here's the breakthrough insight from Miras. Almost every successful AI model until now has used something called mean squared error (MSE) for both attention and retention. MSE basically measures the distance between what you expected versus what you got, and then squares it. It works, but it has problems. Specifically, it's really sensitive to outliers. One weird data point can throw everything off. Miras opens the door to exploring alternatives. Using this framework, the researchers created three new models. There's Yaad, which uses something called Huber loss instead of MSE, and this makes it more robust to errors and outliers. If there's one weird typo in the document, Yaad won't freak out about it. Then you've got Moneta. This explores stricter mathematical rules, generalized norms, for both attention and forgetting, investigating whether more disciplined math leads to better stability.
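The MSE-versus-Huber point is easy to see numerically. Below, one outlier (standing in for that "weird typo") dominates the squared-error objective almost completely, while the Huber loss, which is quadratic near zero but only linear for large errors, keeps it in check. The numbers are illustrative, not from the paper.

```python
import numpy as np

def mse(err):
    """Squared error: penalizes large errors quadratically."""
    return err ** 2

def huber(err, delta=1.0):
    """Huber loss: quadratic for |err| <= delta, linear beyond it,
    so a single wild outlier no longer dominates the objective."""
    a = np.abs(err)
    return np.where(a <= delta, 0.5 * a ** 2, delta * (a - 0.5 * delta))

errors = np.array([0.1, -0.2, 0.05, 8.0])  # last entry is the outlier

# Fraction of the total loss contributed by the outlier alone:
mse_share = mse(errors)[-1] / mse(errors).sum()
huber_share = huber(errors)[-1] / huber(errors).sum()
```

Under MSE the outlier contributes over 99.9% of the total loss; Huber reduces its dominance, which is the robustness property Yaad exploits.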
Then you've got Memoriala, which is forcing the memory to behave like a probability distribution. And this guarantees that the updates are always balanced controlled preventing chaotic memory states. Now this is where we have the power of deep memory. This graph is really interesting. This graph is proof that the Titans actually works and it reveals something crucial. Depth matters a lot. Let me explain what you guys are actually looking at. The X-axis shows sequence length. Basically how long the input text is and the Y axis shows perplexity. Plexity is a measure of how confused or surprised the model actually is. Lowerlexity is better. It means a model understands and predicts the text well. Hyperplexity means the model is struggling and making more mistakes. Now look at the different colored lines. You've got Mamba, which is an existing popular model compared against different versions of the Titan's memory system. The key thing to notice is what happens if the length increases. Mamba's line goes up as sequences get longer. That means it gets more confused and makes more mistakes when dealing with long text. This makes sense because Mamba compresses everything into a fixed size memory. So longer sequences mean more information is getting squashed and lost. But look at the Titans variants L LMM and MM. Their lines stay much lower and flatter. Even as the sequence length increases dramatically, they maintain their performance. They're not getting confused by longer context. The researchers ran this experiment at two different scales. 360 million parameters and 760 million parameters. That's model size. And at both scales, the same pattern holds. Deeper memory architectures maintain better performance on longer sequences. And why does depth help so much? Think about it like this. A shallow memory is like trying to summarize a book by writing one sentence about it. You can do it, but you're probably going to lose all the details. 
A deep memory is like writing a full book report with sections about the characters, plot, themes, and analysis. You capture every nuance and detail. The deeper your memory network, the more sophisticated the compression and understanding can be. Instead of recording just what happened, deep memory can capture why it matters and how it connects to other things. And this is truly essential for understanding long documents. The practical implication of this is huge. If you want an AI that can, you know, read and understand entire books, legal documents, medical records, code bases, you need deep memory. The shallow approaches that we currently have will always hit a wall. And this is where we have something super interesting, too. This is probably one of the most impressive results in the entire research paper, and it shows Titans absolutely destroying the competition on a task called BABILong. Here's why BABILong exists. It hides facts throughout an extremely long document and then asks questions that require finding and connecting those facts. The documents can be over 2 million tokens long. That's like multiple entire books' worth of text. And the AI has to read all of it, remember the important facts buried inside, and then answer questions correctly. Look at what happens when the sequence length increases. GPT-4, one of the most powerful and expensive AI models in the world, crashes hard. Its accuracy drops from decent to basically useless as the documents get longer. And this isn't some tiny model. GPT-4 is massive and cost an enormous amount of money to train. Mamba and the other baselines also crash. They just can't handle the long context. But if we look at Titans, look at that line. It stays high even at extreme sequence lengths where everything else has failed. Titans maintains super strong performance. And remember, guys, the kicker is that Titans is way smaller than GPT-4.
It has a fraction of the parameters and a fraction of the compute cost, and yet it beats GPT-4 on these tasks by huge margins. So why does this matter in the real world? Think about all the situations where you need AI to really understand long context. Lawyers deal with contracts and case documents that can be hundreds of pages. Medical records: a patient's complete medical history might span decades. Scientific research: research papers build on previous work, and understanding the full context of a field requires processing massive amounts of text. Coding: you've got large code bases. Personal assistants: you know, if you've been chatting with an AI for months, you want it to remember absolutely everything. And this is why Google researchers are so excited about this.
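As a side note, the perplexity metric from the depth graph discussed earlier is simple enough to compute yourself. The standard definition is the exponential of the average negative log-likelihood the model assigns to the true tokens; the probability values below are made up purely for illustration.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood of the true tokens).
    High probability on the right tokens = low surprise = low perplexity."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

confident = perplexity([0.9, 0.8, 0.95, 0.85])  # model tracking the text well
confused = perplexity([0.2, 0.1, 0.25, 0.15])   # model losing the thread
```

A model that stays confident deep into a long document keeps its perplexity curve flat, which is exactly the behavior the Titans variants show on the graph.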

### Segment 3 (10:00 - 10:12) [10:00]

This is not just an incremental improvement. This is opening doors to applications that were simply impossible before. I think this breakthrough is a super important step towards AGI, as we start to move toward human-level architectures that have memory just like ours.
