# Google, OpenAI & Anthropic All Reported the Same Threat

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=DgsX6NnF_p4
- **Date:** 06.03.2026
- **Duration:** 19:52
- **Views:** 13,214

## Description

🎓 Learn AI In 10 Minutes A Day - https://www.skool.com/theaigridacademy
Get your Free AGI Preparedness Guide - https://theaigrid.kit.com/agi
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Learn AI Business For Free AI https://www.youtube.com/@TheAIGRIDAcademy


Links From Today's Video:


Welcome to my channel where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

Music Used

LEMMiNO - Cipher
https://www.youtube.com/watch?v=b0q5PR1xpA0
CC BY-SA 4.0
LEMMiNO - Encounters
https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Contents

### [0:00](https://www.youtube.com/watch?v=DgsX6NnF_p4) Segment 1 (00:00 - 05:00)

So, this is by far one of the most interesting stories: US AI labs are under attack. So, let's talk about it. One of the key things I've seen recently is that US labs have been under attack from distillation. Let's dive into exactly what's going on here. You can see here, Anthropic says, "We've identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to improve their own models. And trust me, guys, this is just the tip of the iceberg. I don't think anyone is connecting the story here.

The report says they've identified three industrial-scale campaigns, from DeepSeek, Moonshot, and MiniMax, to illicitly extract Claude's capabilities to improve their own models, basically trying to steal Claude for themselves. And some would argue this is how China has been able to catch up to the US. Of course, many of you, if you've been in the AI space for a while, will know exactly what distillation is: a technique where you train a less capable model on the outputs of a stronger one. Distillation is a widely used and legitimate training method. For example, they talk about the fact that current labs distill their frontier capabilities into small AI models to create cheaper versions for their customers. Google Deep Think: they actually distilled those capabilities down into Google Gemini 3.1 Pro, and that model is a beast. They also note that distillation can be used for illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time and at a fraction of the cost it would take to develop them independently. Now, that is pretty true.
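To make the mechanics concrete, here is a minimal, hedged sketch of what distillation means in code. This is a toy illustration with stand-in linear models, not anything taken from Anthropic's report: the point is just that the student is trained to match the teacher's output distribution (soft targets), and the same principle applies whether the teacher is a tiny network or a frontier LLM queried through an API.

```python
# Toy sketch of knowledge distillation (an assumed setup, not from the report):
# a "student" model learns to imitate the output distribution of a frozen
# "teacher" model. Real LLM distillation applies the same idea at scale,
# often via supervised fine-tuning on the teacher's generated text.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, hidden_dim = 100, 32

teacher = torch.nn.Linear(hidden_dim, vocab_size)  # stand-in for the stronger model
student = torch.nn.Linear(hidden_dim, vocab_size)  # smaller model being trained
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(16, hidden_dim)        # stand-in for input representations
    with torch.no_grad():
        teacher_logits = teacher(x)        # "query the stronger model"
    student_logits = student(x)
    # KL divergence pulls the student's distribution toward the teacher's.
    loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the API-abuse setting described in the video, the attacker never sees the teacher's logits, only its text outputs, so the training step would be ordinary supervised fine-tuning on prompt/response pairs rather than a KL term; the imitation objective is the same.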
And before we dive into the glaring question of all the deep, dark details of whether this is true, I'm just going to dive into Anthropic's report. Now, I do think this is a point that most people aren't talking about, and Anthropic is very correct to raise it. They say that distillation matters because illicitly distilled models lack necessary safeguards, creating significant national security risks. Anthropic and other US companies build systems that prevent states and non-state actors from using AI to, for example, develop bioweapons or carry out malicious cyber activities. Models built through illicit distillation are unlikely to retain those safeguards, meaning that dangerous capabilities can proliferate with protections stripped out entirely. This is extraordinarily true, and you have to understand that right now, of course, it's not that big of a risk, because we're not at the levels where AI can cause that much harm. But think about the capabilities coming in the near future, and maybe you'll start to understand why this could become a bigger issue.

"The upper bound, I think, is a lot scarier. At Anthropic we talk about these safety levels: ASL-3 is where the models are right now; ASL-4 is where the model is recursively self-improving. And so if this happens, essentially we have to meet a bunch of criteria before we can release a model. The extreme is that this happens, or there's some kind of catastrophic misuse, like people using the model to design bioviruses, design zero-days, stuff like this. And this is something that we're really actively working on so that it doesn't happen."

They talk about the fact that foreign labs that distill American models can then feed these unprotected capabilities into military, intelligence, and surveillance systems, enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. And if those distilled models are open-sourced, this risk multiplies, as these capabilities spread freely beyond any single government's control. And I do agree with that. If they do manage to distill frontier model capabilities into an open-source model, that is of course something that is not good, because remember, you have the CBRN risk: chemical, biological, radiological, nuclear, those kinds of risks. Right now, of course, AI isn't able to help you with that, because those models are closed source. But in the coming years, when we start to get open-source models that are relatively good, what happens to the world then? What happens to those risks? How is open-source AI going to be treated? Now, they make a point that I really want you to remember, because I'm going to come back to it in just 4 minutes, okay? And trust me, there is so much to dive into in this video, because the deeper I got into the information, the more I could see the full story might be a bit different. Anthropic says it has consistently supported export controls to help maintain America's lead in AI, but distillation attacks undermine those controls by

### [5:00](https://www.youtube.com/watch?v=DgsX6NnF_p4&t=300s) Segment 2 (05:00 - 10:00)

allowing foreign labs, including those of the Chinese Communist Party, to close the competitive advantage that export controls are meant to preserve. Now, if we take a look at what DeepSeek did, and remember, I'm just going through the information and then we can get into the real juicy stuff: they said DeepSeek essentially had over 150,000 exchanges targeting reasoning capabilities across diverse tasks. There were many attempts to generate synchronized traffic across accounts; identical patterns, shared payment methods, and coordinated timing suggested load balancing to increase throughput, improve reliability, and avoid detection. And they said that in one notable technique, the prompts asked Claude to articulate the internal reasoning behind a completed response and write it out step by step, effectively generating chain-of-thought training data at scale.

Moonshot: they say this company allegedly employed hundreds of fraudulent accounts spanning multiple access pathways. Apparently, it targeted agentic reasoning, tool use, coding and data analysis, computer vision and computer use, and agent development. And one of the most interesting things that most people didn't talk about: they attributed the campaign through request metadata, and they matched that data to the public profiles of senior Moonshot staff. In a later phase, Moonshot used a more targeted approach, attempting to reconstruct Claude's reasoning traces. Now, why people were concerned about this is because it implies Anthropic had a lot of data on these Claude outputs and was potentially able to use it to track the individuals. But many people are wondering how that data is stored, and about the privacy issues in that case, because if Anthropic has all of that data, some people wonder whether they might be encroaching on privacy laws.

And you can see that MiniMax also had 13 million exchanges; they identified them through request metadata and infrastructure indicators and confirmed those timings against MiniMax's public product roadmap. And they said that they detected this campaign while it was still active, before MiniMax released the model it was training, giving them unprecedented visibility into the life cycle of a distillation attack, from data generation through to model launch. And, this is pretty crazy, they said that when Anthropic released a new model during MiniMax's active campaign, they pivoted within 24 hours, redirecting nearly half of their traffic to capture the capabilities of the new system. A single prompt like "You are an expert data analyst combining statistical rigor with deep domain knowledge. Your goal is to deliver data-driven insights, not summaries or visualizations, grounded in real data and supported by complete and transparent reasoning" looks innocuous on its own. But when variations of that prompt arrive tens of thousands of times across hundreds of coordinated accounts, all targeting the same narrow capability, the pattern becomes clear.
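As a concrete illustration of the chain-of-thought extraction technique described above, here is a hedged sketch. The `call_model` function and every name in it are hypothetical stand-ins for any chat-completion API client (none of these identifiers come from Anthropic's report); the two-step pattern of getting an answer and then asking the model to articulate the reasoning behind it is the part the report actually describes.

```python
# Hypothetical sketch of chain-of-thought extraction for distillation data.
# `call_model` is a stand-in for a real chat-completion API client; in this
# sketch it just echoes, so the script runs without any credentials.
import json

def call_model(prompt: str) -> str:
    # Stand-in: a real campaign would send this prompt to the target API.
    return f"<model response to: {prompt[:40]!r}>"

def extract_cot_example(task: str) -> dict:
    answer = call_model(task)
    # Second request: ask the model to articulate the internal reasoning
    # behind the completed response, step by step. This is the technique the
    # report describes for generating chain-of-thought training data at scale.
    reasoning = call_model(
        f"Task: {task}\n"
        f"Your completed response was: {answer}\n"
        "Write out, step by step, the internal reasoning that led to it."
    )
    return {"prompt": task, "chain_of_thought": reasoning, "answer": answer}

if __name__ == "__main__":
    example = extract_cot_example("Summarize the trade-offs of load balancing.")
    print(json.dumps(example, indent=2))  # one SFT record for a student model
```

Run at scale across thousands of coordinated accounts, each such record becomes one supervised fine-tuning example for the student model, which is why the detection signal is the repetition pattern across accounts rather than any single request.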
Now, one thing I want to talk about, guys, and this was tremendously surprising, was the state of affairs over on Twitter, in terms of the response people had to this. You can see here that people were essentially saying that Anthropic is one of the most hypocritical AI organizations, because they took the internet, took all of that data, took many books, and trained an AI model off of that stolen data; but now, when another AI company goes onto their AI model and uses its outputs to improve their own AIs, they aren't allowed to do that. You can see this post describing the irony: it says, "We extracted nearly all (95.8%) of Harry Potter and the Sorcerer's Stone from Claude Sonnet," essentially arguing that if this model had been trained fairly and legally, you wouldn't be able to extract 95% of a book in a single run. You can see Yakine here says, "Steal every movie, book, and copyrighted piece of content in existence. No, you can't pay us for tokens. The worst kind of evil is inconsistent evil." Someone else also said, "No crying in the copyrights casino." And the public reaction has been really surprising: people are essentially against Anthropic for calling this out.

But this is where things start to get interesting, because after this came out from Anthropic, I did some deep digging, and this is just the tip of the iceberg. So, you know how Anthropic is basically saying they identified distillation attacks, that other companies (DeepSeek, Moonshot AI, and MiniMax) are stealing from them, and people are now saying, "Well, who cares? You guys stole the internet; they're stealing from you. You can't really do anything about it." What most people didn't realize is that Google DeepMind and OpenAI had actually said, one week prior to this, that they were experiencing distillation attacks too. So the three most powerful AI labs are all essentially saying that they were under distillation attacks from, let's be honest, probably other AI companies like the Chinese ones mentioned. And you can see here that Google said, and remember this is on February the 12th, that Google DeepMind have identified an

### [10:00](https://www.youtube.com/watch?v=DgsX6NnF_p4&t=600s) Segment 3 (10:00 - 15:00)

increase in model extraction attempts, or distillation attacks, which are a method of intellectual property theft that violates Google's terms of service, and they said they're developing methods to thwart this malicious activity. Now, remember, this was on February the 12th, and this is where you need to start paying attention to the timeline. You had OpenAI, also on February the 12th, the same day Google came out and said they were experiencing distillation attacks: OpenAI essentially published an article, or the information came out, saying OpenAI has warned US lawmakers that Chinese AI startup DeepSeek is targeting the ChatGPT maker and the nation's leading AI companies to replicate their AI models and use them for its own training. The Sam Altman-led OpenAI accused DeepSeek of ongoing efforts to free-ride on the capabilities developed by OpenAI and other frontier AI labs.

So, what do we have? Let's piece together the quick timeline. February 12th: you've got Google saying, "We've got distillation attacks coming in." Later that day, OpenAI basically says, "Look, DeepSeek is doing it to us, too." Then a week later, you have Anthropic saying, "Look, we're also seeing this, and we have evidence of three companies, DeepSeek, Moonshot, and MiniMax, trying to steal our models." But there was also one key thing announced recently, and this is why maybe this is just completely false: on January the 14th, a month earlier, the Trump administration fundamentally shifted US semiconductor policy, allowing the export of advanced AI chips to China under specific conditions. This is what you would call suspicious timing, okay? And that decision has sparked congressional opposition, as it could erase the United States' decisive AI advantage over China. Chinese firms have already placed orders for over 2 million H200 chips, worth many millions of dollars, enabling China to close the gap with the leading US labs.

Think about that for a second. All of this information about distillation is being released right as the Trump administration is debating loosening export controls on China. I'm not sure this is a coincidence. So, when you get this news, you have to ask yourself: is this all just a coincidence, or is this lobbying? Do they want to influence the policy debate? And I think understanding that framing helps explain why these AI labs may have dropped this now. If Trump is loosening and weakening those restrictions, Chinese AI labs are going to have just as much compute as US AI labs, which means that if that does happen, the race in AI development is going to be even fiercer. Which is why I've said: when you look at Google releasing their information saying distillation attacks are coming, OpenAI saying they've observed accounts associated with DeepSeek doing the distillation attacks, and then Anthropic saying, "Look, we've seen the same thing," and all of that lands just a month after the Trump decision, I'm not sure this is so much a security disclosure as a way for them to try and shape the conversation. The big question on everyone's mind is: are these AI companies lying? Because so many people are diving into the numbers and saying that OpenAI, Anthropic, Google, all these AI companies, may be lying about the distillation attacks so that they can lobby governments to tighten restrictions. Now, that could be true. Of course, they do have an incentive to do that.
But maybe it is true that those models are actually distilling from them. I wouldn't be surprised if they are, when you think about it; both of these things can be true at the same time. I was reading a comment section on Reddit, and this is not just something pulled randomly, because I remember this situation happening before. It talks about the fact that GLM is very likely to have distilled from Claude. It says if you copy and paste Claude's system prompt, it behaves identically to Claude, which is not something that most models do. It will even tell you about Tiananmen Square, and that's something Chinese models notoriously would never do. And then, of course, the other comment says that both GLM and Kimi claimed to be Claude, by Anthropic, until very recently, and the commenter is pretty sure Kimi still does. And on top of that, the comment says, China has a repeated track record of blatantly copying American tech "for as long as I've been alive," even down to copying Steve Jobs' look, Apple's stores, keynotes, and devices; so why are people going out of their way to pretend this isn't likely? You can dislike these companies all you want and not care how much they get ripped off, but let's not pretend it's not real. Everyone can suck for different reasons at the same time. And it is true: I think a Google AI engineer was recently caught selling trade secrets to China and was arrested. We know that China has done this many times. But we'll actually dive into why China has its own innovations as well.

Now, some people would argue that maybe the AI models are just completely hallucinating; those prompts don't really give evidence of any distillation, because if you ask Claude Sonnet 4.6 or 4.5 with a Chinese prompt, it actually starts to say that it identifies as DeepSeek. So, it could just be a complete hallucination. And one of the key things many people were discussing about this entire distillation-attacks situation is

### [15:00](https://www.youtube.com/watch?v=DgsX6NnF_p4&t=900s) Segment 4 (15:00 - 19:00)

that 16 million conversations is tiny. You can see here this meme says: "If you prove to me that you can distill a frontier policy by SFT on less than one Tk, I will close my lab, quit my startup, and come work for you." And basically, when you think about it, Claude handles millions of conversations daily; 16 million across multiple companies over an extended period is not really that much data to train on. So either the distillation was much more targeted and surgical than the numbers suggest, or the threat is somewhat overstated for political purposes. And this is, of course, worth exploring, because it raises the question of how much capability you can transfer through distillation at that scale. Consider that Claude reportedly processes around 25 billion API calls per month. If it's true that such a small amount of traffic was able to extract meaningful capability, that is genuinely alarming: it means you don't need massive scale to pull this off effectively. The attacks were surgical, not brute force.
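Here is a quick back-of-the-envelope check on those scale claims, using the figures quoted in the video (16 million exchanges; roughly 25 billion API calls per month). Both inputs are the video's numbers, not independently verified; the arithmetic just shows how the tiny-percentage figure cited below falls out:

```python
# Back-of-the-envelope check of the scale claims quoted in this segment.
# Assumed inputs (figures from the video, not independently verified):
exchanges = 16_000_000            # alleged distillation exchanges, all campaigns
monthly_calls = 25_000_000_000    # claimed Claude API calls per month

print(f"share of one month's traffic: {exchanges / monthly_calls:.4%}")
print(f"share of a year's traffic:    {exchanges / (12 * monthly_calls):.4%}")
# -> 0.0640% of a single month, about 0.0053% of a year's traffic,
#    in the same ballpark as the 0.006% figure cited below.
```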
But then, think about it: this also raises another question. If 16 million exchanges out of hundreds of billions is enough to cause a national security crisis, that kind of suggests the underlying vulnerability is much more fundamental than fraudulent accounts; any sophisticated actor could do this at scale without detection. The real story might actually be that the attack surface is the public API itself, and no amount of account verification or fraud detection is going to fully solve that problem. Which is why we circle back to the big argument that I've had in my head for the last 3 years: maybe these frontier models eventually go private. And that might be the endgame for these frontier models from the top-tier companies. Think about it, okay? If China can bypass chip controls through a public API, using 0.006% of your traffic, without you noticing in real time, export controls alone probably weren't going to be sufficient anyway. Anthropic is basically arguing that yes, we need export controls, and yet those controls have a massive hole in them that nobody has solved.

So this is why I say the next two years of AI development are going to be super interesting. I don't think they will release the frontier models. I mean, when you think about the situation right now, GPT-4-level stuff is now old and publicly accessible, but there's increasing evidence that the actual frontier models they're using internally, for research and the most sensitive applications, never see a public API at all. Anthropic's most capable internal models certainly aren't the ones we are accessing through Claude.ai. And this is the logic that pushes things toward full corporate lockdown. As models become more capable, the CBRN risk calculus shifts dramatically. There's probably a capability threshold where releasing via public API becomes genuinely indefensible, where the bioweapons uplift is so significant that even Anthropic's own safety team would refuse to sign off on it. And then we might be in a situation where we're just not going to get those models. So the distillation revelation actually accelerates this timeline, because they now have proof that public API access can equal capability extraction.

And the business case for keeping frontier models internal, or limiting them to corporate and enterprise settings, just got a lot stronger. Essentially, we might be moving towards a two-tiered AI system: corporate and government entities with vetted access to frontier capabilities (think defense contractors, and honestly I probably shouldn't have used that example, given what Anthropic's going through right now; I mean pharmaceutical companies, financial institutions with proper oversight), and then a public tier that's always two to three generations behind the actual frontier. The public tier may never know how big the gap is, because those companies have every incentive to keep that quiet. And think about it: I think governments may actually push for this anyway, regardless of what the companies want. When you think about the US military and the intelligence community, they already have private agreements with some of these labs. And as AI becomes more central to national security, we can expect legislation that essentially mandates a classified tier of AI capability that never touches a public API. And this is, of course, what creates the kind of AI power concentration that Elon Musk and the open-source crowd are screaming about, except the argument for doing it becomes harder and harder to dismiss as capabilities increase. When your model can genuinely help engineer a pandemic, "but it should be open and free" starts to sound insane. It's one of those situations where both sides can be right about something real, which makes it genuinely difficult to resolve. If you've enjoyed this video, let me know your thoughts down below. It's been Andrew Black, you've been watching TheAIGRID, and I'll see you guys in the next one.

---
*Source: https://ekstraktznaniy.ru/video/11414*