US wants Claude all to itself... because it's "TOO DANGEROUS"

21:51


Wes Roth · 01.05.2026 · 37,283 views · 1,098 likes


Video description

FULL DETAILS IN MY NEWSLETTER: https://natural20.beehiiv.com/p/white-house-wants-claude-mythos-for-itself ______________________________________________ My Links 🔗 ➡️ Twitter: https://x.com/WesRoth ➡️ AI Newsletter: https://natural20.beehiiv.com/subscribe Want to work with me? Brand, sponsorship & business inquiries: wesroth@smoothmedia.co Check out my AI Podcast where me and Dylan interview AI experts: https://www.youtube.com/playlist?list=PLb1th0f6y4XSKLYenSVDUXFjSHsZTTfhk ______________________________________________ #ai #openai #llm

Table of contents (5 segments)

Segment 1 (00:00 - 05:00)

All right, so Anthropic's Claude Mythos is back, and this time the White House wants to prevent it from being used more widely. It's treated as a serious national security concern, and the government wants to limit how many people get access to it. At the same time, OpenAI's GPT-5.5 becomes the second model to do basically what Claude Mythos did. According to AISI, GPT-5.5 is the second model after Mythos to complete a full multi-step cyber attack simulation end to end. All right, so let's unpack this. Anthropic wanted to expand Claude Mythos Preview, their latest scary model. It's supposed to be extremely good at cyber security, at finding vulnerabilities. They wanted to expand access to a total of 120 organizations, for them to try out Claude Mythos and see how it works for them. So Anthropic reportedly wanted to add an additional 70 companies or organizations to the list of those allowed to have Claude Mythos access. That would bring the total number of entities with Claude Mythos access to 120. According to the Wall Street Journal, the White House steps in and goes, "Nope, no way." The reasoning they gave was pretty interesting. The first main reason they cited was national security concerns, or just security concerns: having this model out there for more people to use could potentially cause issues. That makes sense. But there was a second reason they mentioned that I thought was kind of interesting. They said, we don't know if Anthropic has enough compute to be able to serve all these new companies and also us, the White House, the government. Right? Basically they were saying: if Anthropic lets everybody else in, will that degrade our own ability to use Mythos? Now, Anthropic disputes that compute is the limiting factor here, but recently the talk around town, so to speak, has been that Anthropic really miscalculated how much compute they needed and is kind of paying the price for it now.
So, the key tension is that Anthropic reportedly wants Mythos in the hands of more quote-unquote defenders, people who are able to look at it and use the findings to create defenses for their cyber security, etc. The White House worries that wider access will create more cyber security risk and also potentially limit their own ability to use Claude Mythos. Meanwhile, OpenAI is rolling out GPT-5.5 Cyber to their list of critical defenders. You know, people that are going to be using it and testing it to beef up their own cyber security. And the AISI, that's the UK AI Security Institute. It's a government-backed evaluation body that tests these frontier models for dangerous capabilities. You might recall that chart showing that Claude Mythos could complete a multi-step cyber attack simulation end to end. Meaning that this could be used to sustain a cyber attack against an enterprise-level sort of target. So now there are two models that we know of that are capable of doing this. The White House is blocking Mythos from rolling out further. Meanwhile, you know, OpenAI is taking the model that seems to have the same sort of abilities and traits and rolling it out to their list of defenders. So, initially Anthropic had about 50 organizations on their list of those who were supposed to be using Claude Mythos, who had access to it. Although apparently there was some group on Discord that had access to Claude Mythos. They're still investigating that, but for some reason there was, I don't know if to call it a leak, but there was a bunch of people out there that had unauthorized access somehow. But officially you had only 50 groups. So Anthropic was thinking about adding 70 more and bringing it to a total of 120. They're also saying that compute is not an issue: they recently signed compute deals with Amazon, Google, Broadcom, but it's going to take some time to bring all that online.
So, notice that the practical question here isn't just who gets access. It's also who gets priority access when there's limited compute. Does the government get a say in that, saying, hey, we want to be using this model, so these 70 other organizations that you're trying to add, they're going to have to wait, to make sure that we get priority use and have all the compute we need for whatever we use it for. Now, it's important to understand that there's this kind of narrative out there that Anthropic is just making stuff up when it comes to Claude Mythos, that it's just a marketing campaign. They're saying, "Oh, it's this scary model and everybody should be scared of it." But there are other organizations confirming what is being said. First and foremost, I mean, the fact that the White House is so interested in it, obviously, they're seeing something. All the banks that got access to it, the Fed, they had an emergency meeting talking about like, "Hey, this is kind of a real deal. It could really screw up cyber security for us." Right? Obviously, they saw something there that spooked them, right? They're not on Anthropic's PR team trying to drum up business. They saw what this model can do and they got a little scared. At the same time, we have the AISI. So, that's the UK's AI Security Institute. They have this cyber range called The Last Ones. It's a 32-step simulated corporate network attack. So if you wanted to bring down some corporation, you would have to attack it over a number of steps, over a period of time, to sustain this attack and try to take down their network. And AISI has estimated that it would take 20 hours for a human expert to complete this. So Claude Mythos Preview

Segment 2 (05:00 - 10:00)

originally, when it wasn't publicly released but we first started hearing about it and it was made accessible to some of these organizations, completed it end to end in about three out of 10 attempts. And now we're seeing that GPT-5.5, running the same test, completed it in two out of 10 attempts. At the same time, GPT-5.5 achieved 71.4% on their expert-level cyber tasks, whereas Claude Mythos scored 68.6%. So they're very close, with GPT-5.5 being just a little bit higher. One vulnerability that Mythos found that was highlighted by a lot of people was this 27-year-old OpenBSD bug. So something that we thought was very secure, that's been around for a long time. Here comes Claude Mythos and just finds this vulnerability. On the GPT-5.5 test, AISI highlighted one reverse engineering challenge that this model solved in 10 minutes and 22 seconds with about a buck seventy-three in API usage. So by spending $1.73 on the API, it solved that thing in about 10 minutes, versus roughly 12 hours for a human expert. So, we're seeing a massive shrinkage of not just how long it takes to find these things, but also the cost. I mean, less than $2 to find these exploits. That's very, very cheap. If this was available to everybody around the world, that could cause issues, obviously. So, why does all this matter? Well, first and foremost, it seems like we're crossing into territory where this is no longer treated like SaaS, like software as a service. This is treated like controlled national infrastructure. So it's less like Photoshop and more like weapons-grade uranium. The government is effectively deciding who gets access to this software, to these AI technologies, when there are constraints or they're deemed to be a national security threat. And so this looks a little bit like a soft licensing regime, which is what's been talked about in the past, right? So government handing out licenses to companies that they trust, that are able to run that software. That was in the works.
That was talked about, right? But no laws were officially passed. There's no government legislative body that is able to give out these licenses. There's nothing formal. There's nothing on the books. But it does seem like that's kind of what is happening. And I think it's important to understand that the capabilities of these models are inherently dual use. So these AI labs, they're referring to the companies that they give access to as defenders, right? Defenders can find these bugs and vulnerabilities and they're able to patch them faster, right? So they're able to build defenses with access to these models. And of course, attackers can find and exploit these bugs faster. Now, there's a lot of takes online that I've read by very smart engineers, software engineers, highly technically capable people. They go something along the lines of, "Oh, yeah, but people can already find those vulnerabilities. This isn't some brand new capability that is now exposed. Smart engineers could, given the right resources, given the right incentives, find these vulnerabilities." But they're saying, "Well, it's illegal to hack, so therefore those people don't hack. Like, what's the big deal?" This seems to me like kind of a bad take. And since starting this channel, talking to a lot of engineers about, you know, the development of AI, I find that I run into this bad take quite a bit. The reasoning that a lot of very smart engineers use when thinking about how well AI gets adopted for tasks like cyber security, coding, etc., is that they compare its abilities to their own abilities. They'll say things like, "Well, what's the big deal that it can code? I can code better. What's the big deal that it can find these vulnerabilities? I can also find those vulnerabilities." They're comparing AI capabilities at that point to their own abilities. Here's the thing. Their abilities usually are, you know, world class.
They're part of a very small percentage of the world's population with the ability, you know, innate intelligence and schooling and practice, experience, etc., to do that thing. So, to them, it doesn't feel like magic. It doesn't give them anything new. But they might be just 1% of the world. Highly technical coders, engineers, software developers, they're not a large percentage of the population, and they're comparing what AI can do to the best available, right? So they're saying the best available engineers can do this thing better than AI. But the other 99% of the population are comparing it to nothing, to not being able to do it. Like, if you can't code, you can't code. So, if an AI comes along that can code, your ability to code is dramatically increased. If you're a highly paid engineer at one of these tech firms, you're probably not going to go do something illegal like hacking some website or bank or whatever in the hopes of making some money. It's a bad idea. You're better off working and getting paid a lot in a legal way with no threat of jail time. But that reasoning doesn't apply to a lot of people in the world. People that maybe don't speak English as well, that don't have the technical capability, that don't make a lot of money, that are in a desperate situation: given access to something like this, they might be tempted to find exploits and to try to make some money. With models like this, there's no more language barrier. Let's say they don't have full command of, let's say

Segment 3 (10:00 - 15:00)

the English language. But with this, they're able to attempt to hack a lot of the English-speaking companies. They might themselves not be able to find these exploits. But with models like this, it seems pretty trivial. And yes, it might be illegal, but they might be in a situation where they're more afraid of, or desperate about, their current situation than they are of some potential jail time. Or they simply might be at a place in the world where that's not prosecuted as much, where the law isn't as strong as it is in, you know, the US or UK, etc. So I think the big misunderstanding that happens when people try to think about the capabilities of AI is they ask, well, how does this improve a world-class, you know, experienced engineer? Well, I don't know. Maybe it doesn't, maybe it does. I'll leave it up to them to figure it out. My question is, how does it improve your average person? Right now, with GPT-5.5, I've been using it with Codex. I'm able to build a lot of stuff that is very, very useful to me. I do it without touching code. We're at a point where you can build some pretty sophisticated software with zero coding knowledge. This is like the printing press, right? It gives everyone the ability to create or read books. It distributes them to a much wider audience. You no longer need to be a scribe, one of the chosen few, to be able to write and create books. This democratizes it across a very wide population. So Dean Ball posted this on Twitter. He's an AI policy analyst. He worked closely with the government, so he tends to understand how these decisions actually get made and how these things play out. And he's also very plugged in to the current technological AI debate. So he's up to speed with what's happening.
So he's saying, if the Wall Street Journal report is true, then, you know, we can say that the White House is making the right decision, at least in the short term, right? So as a reaction this might be the right decision. But he's saying that long term this is not going to hold. This is just not enough to create long-term safety. It's like building a dam against a tsunami. His core point, I think, is that these capabilities are going to diffuse in the next 6 to 18 months, if not from the Western AI labs, then potentially from some of the open source Chinese labs, or anywhere else really. Mainly right now it's the big Western labs that we all know about, plus all the Chinese labs that are producing these open source models. If the government continues to restrict the further models that are being rolled out, well, we're functionally in a licensing regime. So it's kind of like the idea we talked about, right? If the government is able to license certain labs or certain companies to have access to this technology, then this is a government licensing regime. That's where we're at. And of course, if it does that, it can't just be based on a gut feeling. There have to be formal rules, laws that other people can understand. You know, who needs a license, who doesn't, etc. He argues that it's better to have technical safeguards, not just access restrictions. I think one of the more interesting parts of his argument is that technical AI safety can be accelerationist if it lets defenders safely use stronger systems. Which means that technical safety people, working on the safety of these models, can actually help accelerate the development of AI if we're able to designate the stronger AI models for defenders to use safely. So defenders safely use these stronger systems to patch everything up while the AI labs accelerate forward in order to figure out how to do things safely, how to add technical safeguards, how to do AI alignment, etc.
Then it almost creates a system where AI safety can be used to accelerate progress, which is a different way of thinking about it. So I think the big news here, especially with GPT-5.5 being the second model with the same capabilities as Claude Mythos, I mean, this is suggesting that Mythos wasn't a one-off anomaly. It's part of a broader frontier AI model trend, which means that behind it there are going to be more and more, and over time they're going to get cheaper and access is going to be easier. Now, of course, it's important to understand that these are controlled evaluations. These aren't real attacks against real companies. This is all kind of sandboxed. The simulated environments lack active defenses, triggered alerts don't do anything, there's no defensive tooling. It's a little bit more like a PvE game, not PvP. You're not playing against another person that's responding in real time to your actions. And AISI explicitly said that they don't actually know how this would perform against real-world hardened systems, right? Is it going to be very effective or not effective at all? We don't really know. But the time and cost curves are collapsing, meaning that it's getting faster, it's getting cheaper to run these things. So as fast as the capability is improving, keep in mind that whatever seems amazing today, a few months from now will be faster and cheaper, and, you know what I mean, something better is going to be getting released. Also, according to the AISI, it seems that inference compute improves performance, right? So if we throw more hardware at it, GPUs go brrr at it, you know, that improves the performance of these models. David Sacks also had a counter framing to this. So David Sacks is a venture capitalist. He

Segment 4 (15:00 - 20:00)

is on the All-In podcast. He's an adviser to the Trump cabinet, and generally he pushes back against AI doomerism and regulatory overreach and certain safety narratives that sound exaggerated. So he's saying it's time to demystify Mythos. That Mythos is not magic. It's not a doomsday device. It's one of the first of many AI models that can automate these cyber security tasks, just like AI is automating coding tasks. He says OpenAI's GPT-5.5 can do the same thing. He expects that all the other models, including these leading Chinese models, will be at this point within the next 6 months, and that these models don't create vulnerabilities. That's actually kind of an important lens to look at it through. It's not like they create new attack vectors. They just expose what was there in the first place. They discover bugs that already exist. It's like if we develop a microscope, we're able to see little critters running around. It didn't create them. We're just now able to see them, right? Being able to see bacteria crawling on our skin didn't create a new danger. It just allowed us to study it better, understand it better, create defenses against it better. And so the defender imperative is to get these models into the hands of trusted defenders quickly. So Anthropic is saying this is a big watershed moment, it needs careful release. Sacks's framing is a little bit more that this is inevitable. Don't make it mystical. Don't mystify. Don't call it Mythos. Just arm the defenders first, as soon as you can. Okay. So, a couple points about the politics of this whole thing. AI models are rapidly becoming excellent at offensive cyber tasks. I mean, offense, defense, defense is good, but the offensive capability is really the danger, the risk. Governments want to prevent misuse, and labs want vetted defenders to get access before anybody else. That's kind of obvious. That's what most of the public sees. Behind the scenes, though,
this is what maybe not everybody understands, because they're not following it. This is one step deeper, one layer behind. Anthropic has always been the lab most associated with AI safety, AI regulation. They are the AI safety lab. There's some mistrust between the Trump administration and Anthropic: their former Biden-world connections, their pushing for regulations. So there's some mistrust between the two. Then the Pentagon dispute kicked into high gear. There's even more of a trust deficit, not just between those two, but even from everybody else observing what's happening. I don't think anybody truly liked what was going on. Now, there were different positions. Some people were a little bit more sympathetic to the Pentagon, saying that no outside company should dictate to the American government how their technology is used, and no one should have a back door to control how models get used. Of course, I think overwhelmingly more people probably supported Anthropic, because they stood up against, you know, the government, the sort of war machine, and tried to uphold certain red lines that they didn't want to cross, specifically surveillance of American citizens, as well as these models being used for autonomous warfare. Those were the two big red lines they did not want to get crossed. So with all that being said, it's interesting that both sides kind of need each other anyway. I've talked about it during that whole conflict, even when they parted ways. I said, "What's the chance that Claude still ends up being used in the government even after all of this? It's too good to be ignored. You want to utilize it." Now, what's interesting is that since that time, the situation kind of shifted a little bit. And, you know, I can't see the future. I don't know what's going to be happening next. But here's the thing.
At that moment in time, when the whole Pentagon versus Anthropic thing was happening, Claude was by far the best model at a lot of things. Right now, at the end of April, is that still the case? I personally don't think so. I think OpenAI, with their latest release, overtook Anthropic. And I'm sure people will argue against that, but I am very close to canceling my Anthropic Max Pro plan. And as of yesterday, I got two subscriptions for OpenAI, because I started running into quota on my GPT-5.5 usage and I was like, I really need a second thing. So now I have two $200 OpenAI plans, and when I'm using Codex, I switch in and out, right? So when I hit a certain quota, I switch to the other plan. For whomever this might be useful: if you use this stuff as much as I do, Codex and such, there's actually a very useful open source thing that you can install that, with one line, just shows you where you are in terms of your quota, both for the session and for the week, and when it resets. It does it both for OpenAI's Codex and Claude Code. Very useful. And it was made by, you can probably guess who, it's Peter Steinberger, the guy behind OpenClaw. Does this man sleep? He's saying he's building AI-powered developer tools at ludicrous speed. I agree, it is ludicrous. And specifically, the thing that I installed is his CodexBar. "May your tokens never run out." So, just recommended for anyone who will find this useful. Now, the other interesting point is that this isn't just about safety policy. This is about the very limited amount of compute

Segment 5 (20:00 - 21:00)

that we all have access to. So Anthropic signed a new deal with Amazon, Google, Broadcom, but those buildouts take time. It will be a while before that compute comes online. If Mythos demand scales more rapidly than our available compute, then someone's need gets prioritized above someone else's. And of course here, you know, the federal government doesn't want to be the guy standing in line. By the way, if you're wondering why Mythos causes such a compute crunch: Mythos is the biggest model that Anthropic produces. It's a brand new class. So in Anthropic's designation of how big these models are, at the bottom you have Haiku, then you have Sonnet, then you have Opus. So Opus has been the biggest class of models that we've had access to. Well, for Anthropic, it's the biggest class of models that the public has ever had access to. And Mythos is kind of like the next level up, the first model in that class. It's much bigger. So it's not going to be viable for everyday use the way we use Sonnet or Opus. The amount of compute that it uses will be much bigger. So, if you wanted to, for example, use Mythos to run these security checkups on every major company in America, we simply might not have enough compute to do it. That might take a very long time. So, I think I'll leave it there for now. But just before you go, since you stuck around and watched: this might soon seem like a distant memory, because there's a whole other thing coming that might break cyber security as we know it. And it's not AI, although there's an ex-OpenAI employee, and actually ex-Google as well, who's warning us in no uncertain terms that it's coming within the next few years. I'll talk about that in a future video, but yeah, fun times ahead. So, buckle in. Thank you for watching. Hit thumbs up, subscribe.
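
The figures quoted in Segments 1 and 2 are easier to compare side by side. The short Python sketch below only restates numbers as they are given in the transcript (10 minutes 22 seconds versus roughly 12 hours for the reverse engineering challenge, 3/10 versus 2/10 end-to-end completions, 50 plus 70 organizations); the variable and function names are made up for illustration, and nothing here comes from AISI's own tooling.

```python
# Figures as quoted in the transcript; names below are illustrative only.
MODEL_SOLVE_SECONDS = 10 * 60 + 22   # 10 minutes 22 seconds (GPT-5.5, per the video)
HUMAN_SOLVE_SECONDS = 12 * 60 * 60   # "roughly 12 hours" for a human expert
API_COST_USD = 1.73                  # API spend quoted for that one challenge


def speedup(human_seconds: int, model_seconds: int) -> float:
    """How many times faster the model run was than the human estimate."""
    return human_seconds / model_seconds


def success_rate(successes: int, attempts: int) -> float:
    """Fraction of attempts that completed the range end to end."""
    return successes / attempts


time_ratio = speedup(HUMAN_SOLVE_SECONDS, MODEL_SOLVE_SECONDS)
mythos_rate = success_rate(3, 10)    # Claude Mythos Preview: about 3 of 10 attempts
gpt_rate = success_rate(2, 10)       # GPT-5.5: 2 of 10 attempts
total_orgs = 50 + 70                 # existing access list plus proposed additions

if __name__ == "__main__":
    print(f"Time ratio on the challenge: about {time_ratio:.0f}x faster")
    print(f"API cost for that run: ${API_COST_USD:.2f}")
    print(f"End-to-end completion: Mythos {mythos_rate:.0%}, GPT-5.5 {gpt_rate:.0%}")
    print(f"Organizations with access if expanded: {total_orgs}")
```

With ten attempts per model, the gap between three and two completions is a single run, so the two results are better read as roughly comparable than as a ranking.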
