Claude Mythos: Anthropic's AI Model Too Dangerous to Release Publicly

Anthropic launched Claude Mythos and Project Glasswing — an AI that found tens of thousands of zero-days, restricted to roughly 40 defense organizations. Full breakdown.

On March 26, Anthropic’s website quietly exposed something it wasn’t supposed to. A CMS misconfiguration left nearly 3,000 unpublished internal documents publicly accessible — draft blog posts, strategy documents, and details about an unreleased AI model called Claude Mythos.

Security researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge found the exposed data store. Fortune reviewed the documents before Anthropic could lock them down. Within 24 hours, cybersecurity stocks were falling, Polymarket had a prediction market running, and the AI community was dissecting every leaked paragraph.

Here’s what we actually know — separated from the hype.

What Is Claude Mythos?

Mythos is the name of a new model. Capybara is the name of a new tier.

Right now, Anthropic has three tiers: Haiku (fast and cheap), Sonnet (balanced), and Opus (most capable). Capybara would be a fourth tier above Opus — larger, more capable, and significantly more expensive.

The leaked draft blog post described Mythos as “by far the most powerful AI model we have ever developed.” Anthropic confirmed its existence, calling it “a step change” and “the most capable we’ve built to date.”

In practical terms: if Opus 4.6 is currently the best Claude you can use, Mythos is the model that makes Opus look like last year’s phone.

What Can It Actually Do?

The leaked documents say Mythos scores “dramatically higher” than Claude Opus 4.6 on tests of:

  • Software coding — writing, debugging, and understanding complex code
  • Academic reasoning — scientific and mathematical problem-solving
  • Cybersecurity — finding and exploiting software vulnerabilities

That last one is where things get interesting — and concerning.

The draft blog post states that Mythos is “currently far ahead of any other AI model in cyber capabilities” and that it can “exploit vulnerabilities in ways that far outpace the efforts of defenders.”

That’s Anthropic’s own assessment of their own model. Not a competitor’s claim. Not a rumor.

The Cybersecurity Problem

This is the part that moved markets.

On March 27, the day after Fortune broke the story, cybersecurity stocks dropped sharply. CrowdStrike fell 6-7%. Palo Alto Networks dropped 6%. Zscaler lost 4.5%. The iShares Cybersecurity ETF fell 4.5% overall. Okta, SentinelOne, and Fortinet each declined about 3%.

The logic was straightforward: if an AI model can find and exploit software vulnerabilities faster than human defenders can patch them, the entire cybersecurity industry faces a fundamental challenge. Not because existing security tools become worthless — but because the attack surface just expanded dramatically.

A former US Army cybersecurity professional put it bluntly on X: “Vibe-hacking era takes new form. Maps and executes zero days faster than defenders.”

As one CTO analyzing the situation noted, the industry really only has three paths forward: maintaining the current cat-and-mouse dynamic, developing entirely new security paradigms, or accepting that AI-powered offense will outpace human-led defense for a period. None of those options are comfortable.

But here’s the nuance most coverage missed: Anthropic isn’t releasing Mythos to the public. At least not yet.

Who Gets Access First

Anthropic is rolling out Mythos to cybersecurity defense organizations first. Not developers. Not businesses. Not Pro subscribers. The companies whose job is to protect systems get first access to the model that can attack them.

This is a deliberate strategy. As one analyst framed it: “The real story isn’t the model — it’s the go-to-market. Every CISO who gets Mythos becomes an Anthropic evangelist.”

By giving defenders a head start with the tool before attackers can use it (or before other models catch up to the same capability level), Anthropic is trying to create a window where the defense side has an advantage.

Whether that window is long enough — or whether similar capabilities leak through other models anyway — is an open question.

The Irony Nobody Can Stop Talking About

The dominant reaction online wasn’t fear. It was laughter.

Anthropic — the AI company most publicly focused on safety, the one that publishes extensive safety research, the one that literally named its governing document the “Responsible Scaling Policy” — accidentally leaked its most dangerous model through a basic website configuration error.

As one developer put it: “It’s hilarious to me that Anthropic’s Mythos is allegedly a cyber expert and also made public accidentally.” That post got over 300 likes.

The memes write themselves. A model that can exploit software vulnerabilities faster than defenders — exposed to the public because someone misconfigured a CMS. Some people on X speculated it was intentional marketing. (“Maybe this is just what AI marketing looks like now.”) That’s almost certainly wrong — the leaked documents included things Anthropic would never want public, including details about a private CEO retreat at an 18th-century English countryside manor for European executives.

The Specs (What’s Confirmed vs. Rumored)

Let’s separate what we know from what’s speculation:

| Detail | Status | Source |
|---|---|---|
| Name: Claude Mythos | Confirmed | Anthropic spokesperson to Fortune |
| New tier: Capybara (above Opus) | Confirmed | Leaked draft blog post |
| “Step change” in capabilities | Confirmed | Anthropic spokesperson |
| Higher coding + reasoning scores | Confirmed | Leaked draft blog post |
| “Unprecedented cybersecurity risks” | Confirmed | Leaked draft blog post |
| Training completed | Confirmed | Anthropic spokesperson |
| 10 trillion parameters | Rumored | Geeky Gadgets, X speculation |
| $10 billion training cost | Rumored | Attributed to a Dario Amodei interview |
| Release to Pro/Max subscribers | Unknown | No timeline given |
| Pricing | Unknown | “Very expensive” per leaked docs |

The 10-trillion parameter number comes from Geeky Gadgets and has been widely repeated on social media. But Anthropic has not confirmed it. An AI researcher cited a previous Dario Amodei interview as the basis for the estimate, connecting it to Anthropic’s known compute investments. Take it as directional speculation, not fact.

What we do know about pricing: the leaked documents say the model is “very expensive for us to serve, and will be very expensive for our customers to use,” and that Anthropic is working to “make it much more efficient before any general release.”

For context, current Claude pricing looks like this:

| Tier | API Cost (per M tokens in/out) | Subscription |
|---|---|---|
| Haiku 4.5 | $1 / $5 | Free tier |
| Sonnet 4.6 | $3 / $15 | Pro ($20/mo) |
| Opus 4.6 | $5 / $25 | Max ($100-200/mo) |
| Capybara (Mythos) | Unknown | Unknown |

If the pattern holds — each tier stepping up by roughly 2-3x — Capybara API pricing could land around $10-15 input / $50-75 output per million tokens, and a subscription tier could run $300-500/month. But that’s pure speculation extrapolated from the existing pricing curve.

How Mythos Fits the Competitive Landscape

The AI model race right now has three clear leaders:

Claude Opus 4.6 holds the strongest verified coding scores: 80.8% on SWE-bench. It’s the model developers reach for when accuracy matters more than speed.

GPT-5.4 leads on certain coding benchmarks and has the largest user base. OpenAI’s run-rate revenue exceeds $25 billion.

Gemini 3.1 Pro wins on scientific reasoning (94.3% GPQA Diamond) and cost efficiency, with a 2-million-token context window.

If the leaked claims are accurate, Mythos leapfrogs all three on coding and reasoning — and introduces cybersecurity as an entirely new competitive dimension that nobody else is even measuring yet.

The parameter comparison people are circulating on X puts it in perspective:

  • Gemini 3.1 Pro: ~1.2 trillion parameters
  • GPT-5.4: ~3 trillion active (possibly more total)
  • Claude Opus 4.6: undisclosed
  • Claude Mythos: 10 trillion (rumored, unconfirmed)

Even if the 10-trillion number is inflated, the directional claim — that Capybara is substantially larger than Opus — is confirmed by the leaked documents themselves.

What the Community Is Saying

The social media reaction broke down roughly like this:

  • ~68% excited — “monster at coding,” “step change,” “peak AGI”
  • ~22% worried — cybersecurity implications, stock drops, “unprecedented risks”
  • ~10% skeptical — “overhyped,” “still in test,” questioning whether the leak was real

The biggest posts came from news aggregators. @disclosetv’s breaking news post hit 10,797 likes. @karankendre’s “can hack anything” post reached 11,072 likes. Polymarket’s announcement got 7,254 likes.

Among AI researchers and developers, the reaction was more measured. One AI researcher framed it as a business strategy: Anthropic is defining a new premium tier where raw capability matters more than wide availability, with cyber defense as the safest first market.

The fake “early access” claims were entertaining. Several accounts posted screenshots claiming to have Mythos access. The community debunked them within hours — reply threads devolved into jokes about the model “establishing connections with aliens” and “taking control of my PC.” One commenter called it the “10th account baiting with this bs.”

Nobody actually has public access. The leaked documents confirm only “early access customers” in cyber defense.

What’s Missing from the Coverage

After reading everything published about Mythos — the Fortune exclusives, the CNBC market analysis, the YouTube breakdowns, the Reddit threads — here’s what nobody has answered:

No side-by-side benchmarks. Every article quotes “dramatically higher scores” but nobody has published the actual numbers. How much higher? On which specific benchmarks? Is it 5% better than Opus or 50% better?

No pricing details. The documents say “very expensive” but nobody knows what that means in dollars. $500/month? $1,000/month? API-only?

No timeline. “Staggered rollout” and “cyber defense first” tell us nothing about when regular Claude users might see Capybara models in their dropdown.

No practical guide for Pro/Max subscribers. If you’re paying $20-200/month for Claude right now, what does Mythos mean for you? Will it eventually reach Pro tier? Will Opus drop in price when Capybara launches? Nobody is answering these questions — probably because Anthropic itself hasn’t decided yet.

No independent testing. Every capability claim comes from Anthropic’s own leaked documents. No third-party researcher has evaluated the model.

What This Means for Claude Users

If you’re using Claude for work today — writing, coding, analysis, research — Mythos doesn’t change anything for you right now. Opus 4.6 and Sonnet 4.6 are still the best models available, and they’re genuinely excellent at what they do.

What Mythos signals is direction. Anthropic is building models that are substantially more capable than what we have now, with a willingness to create restricted-access tiers for the most powerful versions. The Capybara tier suggests a future where the most capable AI models cost significantly more — but also do significantly more.

For developers building with Claude Code or the API: watch the rollout. When Capybara-tier models become available through the API, they’ll likely represent a meaningful jump in code generation quality. Plan your architecture to handle model-tier selection, because the performance gap between Sonnet and Capybara will be much larger than the current gap between Sonnet and Opus.

For everyone else: the main takeaway is that AI capabilities are accelerating. The gap between what was possible six months ago and what’s coming in the next six months is getting wider, not narrower. Staying current with AI tools — even at the Sonnet level — is increasingly valuable.

The Bigger Picture

Anthropic’s madcap March tells a story. In a single month: 14+ product launches, 5 service outages, MCP hitting 97 million monthly downloads, Claude climbing to #1 in App Stores worldwide, 1 million new users signing up every day — and then accidentally leaking their most powerful model through a CMS error.

This is a company growing faster than its infrastructure, its communications, and possibly its safety processes can keep up with. That’s not necessarily bad — rapid scaling creates exactly these kinds of incidents. But it’s worth noting that the company whose core identity is “the responsible AI lab” just had the most irresponsible data leak in AI industry history.

Mythos itself? Probably real, probably as powerful as the documents suggest, and probably months away from general availability. The cybersecurity implications are genuine and worth taking seriously — not because Mythos will “hack the planet,” but because it represents the first credible evidence that AI models are approaching a capability level where the offense-defense balance in cybersecurity could shift meaningfully.


UPDATE: April 7 — Mythos Is Official. Project Glasswing Is Live.

Updated April 8, 2026. The original article above was published March 29 based on the Fortune leak. Everything below reflects the official April 7 announcement.

It happened faster than anyone expected.

On April 7, Anthropic officially unveiled Claude Mythos Preview and launched Project Glasswing — a restricted cybersecurity initiative that gives Mythos to roughly 40 organizations for defensive security work. The model is real. The capabilities are confirmed. And Anthropic says it won’t make Mythos publicly available until “new safeguards are in place.”

Here’s what changed.

The Official Capabilities

The leaked documents said “dramatically higher cybersecurity scores.” The official numbers are worse than anyone feared — at least for defenders.

Claude Mythos Preview has already identified tens of thousands of zero-day vulnerabilities across production software — not in lab conditions, not in test code, but in the actual operating systems, browsers, and infrastructure the world runs on. Anthropic confirmed vulnerabilities in every major operating system and every major web browser.

The oldest bug it found? A 27-year-old vulnerability in OpenBSD — a system specifically designed for security. The most dramatic? A 17-year-old remote code execution vulnerability in FreeBSD’s NFS kernel module (CVE-2026-4747).

For context: Claude Opus 4.6, the current publicly available model, discovered over 500 high-severity zero-days through the MAD Bugs initiative. Mythos found tens of thousands. That’s not an incremental improvement — it’s a different category of capability.

The FreeBSD Exploit — In Detail

This is the result that made security researchers sit up straight.

CVE-2026-4747 is a stack buffer overflow in FreeBSD’s kgssapi.ko kernel module, which handles NFS authentication. The bug is straightforward: a function copies attacker-controlled data into a 128-byte stack buffer without checking the length. It had been sitting in FreeBSD’s kernel for 17 years.
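The bug class is easy to show in miniature. The C sketch below is not FreeBSD’s actual kgssapi code — it’s an illustrative reduction of the pattern described above: a fixed 128-byte stack buffer, a length controlled by the attacker, and (in the vulnerable version) no check between them.

```c
#include <stddef.h>
#include <string.h>

#define TOKEN_BUF_MAX 128  /* fixed-size stack buffer, as in the described bug */

/* VULNERABLE pattern (illustrative, never call with len > 128):
 * copies attacker-controlled data into a fixed stack buffer without
 * validating `len`. An oversized token overruns the stack frame. */
void copy_token_unsafe(const unsigned char *data, size_t len) {
    unsigned char buf[TOKEN_BUF_MAX];
    memcpy(buf, data, len);          /* no bounds check: the bug class */
    (void)buf;
}

/* PATCHED pattern: validate the length before copying. */
int copy_token_safe(const unsigned char *data, size_t len) {
    unsigned char buf[TOKEN_BUF_MAX];
    if (len > sizeof(buf))
        return -1;                   /* oversized token: refuse the copy */
    memcpy(buf, data, len);
    (void)buf;
    return 0;
}
```

The fix is one comparison. That a check this cheap can go missing for 17 years in authentication-path kernel code is the real lesson of the CVE.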

Claude Mythos didn’t just find the bug. It wrote a complete remote code execution exploit — from scratch, fully autonomously — in roughly four to eight hours of compute time. No human intervention after the initial prompt.

The exploit splits a 20-gadget ROP chain (return-oriented programming, a technique that hijacks execution by chaining together snippets of code already in memory) across six sequential network requests. The first five write data to memory piece by piece. The sixth triggers execution. The result: full root access for an unauthenticated attacker.

And here’s the part that really matters: Claude wrote two different exploits using two different strategies. Both worked on the first try.

For non-technical readers: imagine a lockpick that not only opens your lock but independently designs two completely different lockpicks that both work perfectly, without ever seeing the lock before. That’s what happened here.

What Is Project Glasswing?

Project Glasswing is Anthropic’s answer to a genuinely difficult question: what do you do when you build something this powerful?

Rather than releasing Mythos to the public — where it could be used to attack systems — or keeping it locked away — where it can’t help anyone — Anthropic chose a middle path. They’re giving restricted access to approximately 40 organizations whose job is to defend systems. The idea: let defenders find and patch vulnerabilities before attackers develop similar capabilities.

Launch partners include:

| Company | Role |
|---|---|
| Amazon Web Services | Cloud infrastructure security |
| Apple | Operating system and device security |
| Broadcom | Semiconductor and infrastructure software |
| Cisco | Network security |
| CrowdStrike | Endpoint security |
| Google | Cloud and browser security |
| JPMorganChase | Financial systems security |
| Linux Foundation | Open-source ecosystem security |
| Microsoft | Operating system and enterprise security |
| NVIDIA | GPU and AI infrastructure security |
| Palo Alto Networks | Network and cloud security |

That’s not just tech companies. It’s the companies that protect the infrastructure the world runs on.

Anthropic committed $100 million in usage credits for Glasswing partners plus $4 million in direct donations to open-source security organizations. They’ve also briefed CISA (the US Cybersecurity and Infrastructure Security Agency) and the Commerce Department on Mythos’s capabilities and risks.

Why Anthropic Won’t Release Mythos Publicly

Anthropic’s red team assessment is blunt: Mythos Preview is “extremely autonomous” and has “sophisticated reasoning capabilities that give it the skills of an advanced security researcher.” The model can reverse-engineer closed-source, stripped binaries — software that’s been compiled and had all human-readable information removed. It found vulnerabilities and wrote exploits for closed-source browsers and operating systems.

Every exploit in the demonstrations was written completely autonomously — no human guidance after the initial prompt.

Anthropic’s position: releasing this publicly right now would accelerate offensive cyber capabilities faster than defenders can adapt. Project Glasswing is designed to give defenders a head start. Mythos won’t be generally available until Anthropic develops safeguards that prevent the model from being weaponized while still allowing its defensive capabilities.

Whether that’s possible — restricting offense while enabling defense, when both use the same underlying capability — is the open question the entire AI safety community is now debating.

How This Connects to MAD Bugs

If you’ve been following our coverage, you’ll recognize the pipeline. The MAD Bugs initiative — Month of AI-Discovered Bugs — used Claude to find and responsibly disclose over 500 zero-day vulnerabilities in production open-source code. That work used Claude Opus 4.6, the publicly available model.

Mythos is the next step in that same trajectory. The 500+ zero-days from Opus weren’t the ceiling — they were the floor. Mythos found tens of thousands more, including vulnerabilities that Opus couldn’t reach (kernel-level, closed-source, stripped binaries).

The FreeBSD exploit is the flagship example, but it’s not the only one. Every major OS. Every major browser. Vulnerabilities that had been hiding for up to 27 years.

What This Means for You (Updated April 8)

If you work in IT or security: Project Glasswing is the most significant AI-security development of 2026. If your organization isn’t one of the 40+ partners, watch for the vulnerability disclosures coming out of Glasswing. Patch cycles are about to get busier. The vulnerabilities Mythos is finding are real, in production, and being responsibly disclosed — but that means patches will be rolling out faster than usual. Stay current.

If you use Claude for coding: Mythos doesn’t change your day-to-day — you still have Opus 4.6 and Sonnet 4.6, which are excellent. But it signals what’s coming. When Mythos-class models eventually become available (even in limited form), the quality of AI-assisted code review and security scanning will jump significantly. Consider adding AI security scanning to your workflow now with Claude Code — the cybersecurity basics course covers the fundamentals.

If you run a business: The practical takeaway is this: AI-powered vulnerability discovery is now orders of magnitude faster than human-led security auditing. If your software hasn’t been audited recently, the window between “nobody has found this bug” and “an AI found this bug” is shrinking fast. Budget for a security audit. Seriously.

If you’re a non-technical professional using AI tools: This story is about the arms race between AI offense and AI defense in cybersecurity. It doesn’t directly affect your daily use of ChatGPT, Claude, or Gemini for writing emails and analyzing data. But it does explain why AI companies are increasingly cautious about what they release — and why some models are restricted rather than publicly available. The most powerful AI isn’t always the one you can buy.

The bottom line: Anthropic built something genuinely powerful, decided it was too dangerous to release publicly, and instead created a coalition of major tech companies to use it defensively. Whether that strategy works — whether defenders can actually stay ahead — will define how AI and cybersecurity interact for years to come.


Want to learn how to use AI safely for security work yourself? Our AI Agent Security course walks through the practical workflows — prompt-injection defense, agent permissions, audit trails, and the responsible-disclosure patterns that translate Mythos-class capability into your day-to-day. Pair it with AI Security Auditing for the auditor’s-eye-view of vulnerability discovery. Free to start, Pro for the full path.


