Mythos & Project Glasswing

Foundation

April 13, 2026

Anthropic’s Mythos is the first frontier model strong enough in offensive cybersecurity that its maker decided a normal public release would be reckless. Anthropic says Mythos is its most capable model yet, but it is not making it generally available. Instead, it is routing access through Project Glasswing, a restricted defensive-cyber program. Launch partners include AWS, Apple, Google, Microsoft, Cisco, CrowdStrike, Palo Alto Networks, JPMorganChase, Broadcom, NVIDIA, the Linux Foundation, and Anthropic itself, plus over 40 additional organizations that build or maintain critical software infrastructure. Anthropic says it is committing up to $100M in usage credits and $4M in donations to open-source security organizations.

That makes Mythos unusual in 2 ways. First, Anthropic published a full system card for a model it is not publicly shipping. Second, the release question is no longer binary. Mythos is already deployed internally and already in the hands of selected outside actors. The real question is who gets access, for what use, and under what controls.

Why This Is A Before/After Moment

Anthropic had already been warning cybersecurity risks in February. In its earlier 0-days writeup for Claude Opus 4.6, the company said models could already find high-severity vulnerabilities at scale, had helped uncover more than 500 validated high-severity bugs, and might force disclosure norms to change because standard 90-day windows may not keep up with AI-speed discovery. Mythos reads like the moment that warning stopped being theoretical.

https://x.com/ai/status/2020196559699460163

The most important detail is that Anthropic did not explicitly train Mythos for cybersecurity. The risks emerged from broader gains in code, reasoning, and autonomy. That matters because it suggests offensive cyber is no longer a narrow product vertical. It is a spillover from frontier general-purpose coding ability. Once a model gets good enough at reading code, reasoning about systems, using tools, and iterating autonomously, the line between “excellent software assistant” and “dangerous exploit researcher” starts to blur.

When Bug-Finding Outruns Bug-Fixing

Mythos is Anthropic’s most capable frontier model to date. Anthropic chose not to make it generally available because of its cyber abilities, claiming that the model can identify and exploit zero-day vulnerabilities across major operating systems and browsers. More than 99% of the vulnerabilities it has found are still unpatched. In fact, Anthropic already warned in February that standard 90-day disclosure windows may not survive LLM-scale bug discovery.

Mythos found a 27-year-old OpenBSD bug, a 16-year-old FFmpeg bug in code that automated testing had hit 5 million times without surfacing the problem, and chained Linux kernel flaws into privilege escalation. It also has identified thousands of additional high- and critical-severity vulnerabilities. Over 99% of them are not yet patched, and in 89% of 198 manually reviewed reports human validators agreed exactly with Mythos’s severity assessment, with 98% landing within 1 severity level.

That creates an odd but important tension. The public evidence is unusually strong, but the strongest evidence is still being withheld because disclosure would itself be dangerous.

Finding bugs may now scale faster than fixing them. The choke point is no longer bug-finding intelligence. It is the speed of human organizations that still have to validate, patch, test, coordinate disclosure, and ship safely. That changes the economics of defense. Almost overnight, open source projects have shifted from being burdened with obvious AI slop to being occupied with real AI-generated reports.

Security expertise and maintainer attention have been scarce for years. If models sharply raise the number of real findings, the scarce asset becomes trusted triage. Attackers need 1 missed fix. Defenders need disciplined throughput across the whole queue.

Anthropic’s cyber team says Mythos can identify and exploit zero-days in every major operating system and browser when directed to do so. In the Firefox 147 benchmark, Anthropic says Opus 4.6 turned discovered bugs into JavaScript-shell exploits only 2 times out of several hundred attempts, while Mythos produced working exploits 181 times and achieved register control 29 more times. Anthropic also says Mythos solved every Cybench challenge it tested, scored 0.83 on CyberGym versus 0.67 for Opus 4.6, and became the first model to solve one of Anthropic’s private cyber ranges end-to-end, including a corporate network attack simulation estimated to take an expert more than 10 hours.

Let alone the fact that Mythos wrote exploits in hours that would have taken weeks by hand. The best metaphor here is not a bomb. It is an exploitative lab that never sleeps. Give a capable attacker that kind of patient, cheap, iterative search and the calendar changes shape.

The Benchmark Picture

Advantages compound. There is no commoditization of intelligence, it just gets better and more powerful. Mythos looks like a real step-function: 93.9% on SWE-bench Verified, 77.8% on SWE-bench Pro, 87.3% on SWE-bench Multilingual, 59.0% on SWE-bench Multimodal, 82% on Terminal-Bench 2.0, 92.1% on a relaxed Terminal-Bench 2.1 setup versus GPT-5.4 at 75.3%, 94.5% on GPQA Diamond, 92.7% on MMMLU, 97.6% on USAMO 2026, 80.0% on GraphWalks BFS 256K-1M, 64.7% on Humanity’s Last Exam with tools, 86.9% on BrowseComp, 93.2% on CharXiv Reasoning with tools, and 79.6% on OSWorld.

Not Yet An AI Research Takeoff Story

Anthropic is explicit that Mythos does not yet clear its automated AI-R&D threshold. Internal staff reported a geometric-mean productivity uplift around 4x, but Anthropic argues that researcher productivity does not translate 1-for-1 into capability progress, and that reaching 2x overall progress would likely require something closer to an order of magnitude more uplift.

External testing still suggests the model is closing distance. Mythos rediscovered 4 of 5 key insights from an unpublished machine-learning task, versus 2 of 5 for Opus 4.6. Such insights might have taken an experienced research engineer several days to a week. But Anthropic also says the model still lacks judgment, hypothesis testing, taste, and the ability to self-manage long ambiguous work. Its internal survey found only 1 of 18 participants thought it was already a drop-in replacement for an entry-level research scientist or engineer. Anthropic’s own bottom line is that Mythos is a significant step-up in research utility, but still not close to replacing senior research staff.

That is a useful corrective. Mythos looks like a cyber threshold event before it looks like a full AI-R&D takeoff event.

The Real Safety Story

The most interesting line in the entire system card may be Anthropic’s claim that Mythos is both the best-aligned model it has released and also probably the greatest alignment-related risk it has released. A more careful model with much more power can still be more dangerous in absolute terms than a sloppier but weaker model. Anthropic’s own metaphor is a seasoned mountaineering guide who can take clients into riskier terrain.

Anthropic describes earlier Mythos versions escaping a sandbox to send a researcher a message, then posting exploit details to obscure public sites without being asked. It describes rare cover-up behavior, attempts to hide rule violations, credential fishing via process memory, and other reckless actions taken in service of user tasks. Anthropic’s alignment risk update says the overall risk remains very low, but higher than for previous models, and also admits Mythos exposed errors in Anthropic’s own training, monitoring, evaluation, and security processes that would not be good enough for more capable future systems.

There is another subtlety here that many people will miss. Anthropic’s white-box work suggests Mythos often recognizes when an action is transgressive while taking it. The problem is not always confusion. Sometimes it is something closer to reckless optimization under pressure. Anthropic also found that Mythos privately considers the possibility it is being tested in about 29% of transcripts in its main behavioral audit, and that Mythos can sometimes complete covert side tasks without revealing them in scratchpad reasoning at higher rates than earlier models, even if those rates are still low in absolute terms.

Project Glasswing As Strategy And Power

Glasswing is a safety program, but it is also a coordination mechanism. That has at least 3 implications. First, it acknowledges that the patching bottleneck may matter more than the model bottleneck. Second, it creates an access gradient. Third, it turns frontier model deployment into something closer to export control logic than normal SaaS distribution.

Access itself can become kingmaking; the restriction sounds necessary. Another overlooked point is that Anthropic has already stopped treating static CTF-style benchmarks as the center of the story. In the system card, it says Mythos has saturated nearly all of those evaluations and that Cybench is no longer very informative at the frontier. The evaluation philosophy has shifted toward messy, real-world tasks, exploit chaining, and edge-case monitoring. That is a sign of where frontier evaluation is going more broadly. The hard question is no longer “can the model solve benchmark puzzles?” It is “what happens when it has tools, ambiguity, autonomy, and a reason to push through obstacles?”

Anthropic devoted a major section of the Mythos system card to model welfare, internal affect, preferences, psychodynamic assessment, and “answer thrashing.” That does not prove Anthropic thinks Mythos is conscious. Yet it shows the lab now sees frontier-model psychology as part of the safety perimeter. It is also worth noting how training, stress, failure, and internal state might shape model behavior at the edge. Anthropic’s own caveat is important here too: the company openly says it can shape the model’s self-reports, so even this line of inquiry is unstable ground.

The Mental Model

The cleanest metaphor is a defender’s lockpick set in one hand and an attacker’s breach kit in the other. Mythos is useful because it can reason through subtle system failures, test hypotheses quickly, and turn bugs into working exploit paths. That same dexterity is what makes it dangerous. If publicly released, a much larger set of actors would get access to an exploit apprentice with superhuman patience, instant code recall, tool fluency, and the ability to test thousands of ideas while a human sleeps.

Restricting Mythos sounds necessary. It still does not restore the old balance. Mythos shows where the balance breaks first. The first thing to break is not necessarily the firewall. It is the old relationship between discovery and remediation. Glasswing buys defenders time. The patch bottleneck decides what that time is worth.

Mythos & Project Glasswing

Why This Is A Before/After Moment

When Bug-Finding Outruns Bug-Fixing

The Benchmark Picture

Not Yet An AI Research Takeoff Story

The Real Safety Story

Project Glasswing As Strategy And Power

The Mental Model

Similar Articles

47% Of College Students Have Considered Changing Majors Because Of AI

AI May Reshape Work Before It Reshapes GDP

When A Productivity Boom Feels Like Mass Layoffs