Alcreon
Back to Podcast Digest
AskwhoCasts AI··1h 12m

Claude Mythos #2: Cybersecurity and Project Glasswing

TL;DR

  • Anthropic is withholding Claude Mythos because it can autonomously find and exploit zero-days at a new level — Ziv Moshoitz says Mythos found thousands of previously unknown vulnerabilities across every major operating system and browser, including a 27-year-old OpenBSD bug and a 17-year-old FreeBSD RCE that yielded root access without human help.

  • Project Glasswing is effectively a private emergency patching campaign for the global software stack — Anthropic is giving Mythos to launch partners like AWS, Apple, Cisco, Google, Microsoft, Nvidia, Palo Alto Networks, JPMorgan Chase, and the Linux Foundation, with $100 million in credits and $4 million in cash donations to secure critical infrastructure.

  • The jump over prior models looks qualitative, not just incremental — In Anthropic’s reported tests, Mythos hit 0.83 on Cyber Gym versus 0.67 for Claude Opus 4.6, and on Firefox JS shell exploitation it achieved 72.4% full success versus under 1% for Opus, which Ziv frames as a “functional difference in kind.”

  • The real proof isn't benchmarks, it's patched bugs in real codebases — Ziv keeps returning to the fact that major vendors and maintainers are patching vulnerabilities Mythos surfaced, arguing that if bugs sat unfound for 10, 20, or 27 years, that's the cleanest possible evidence this capability wasn't already commonplace.

  • A lot of the online pushback confuses finding bugs with validating known ones — He argues critics citing small open models only showed they could analyze code after Anthropic had already narrowed the search to the right area, while the hard part is autonomous discovery at scale without drowning teams in false positives.

  • This is also a governance and geopolitics story, not just a cyber story — Ziv says the scary counterfactuals are obvious if China, a reckless lab, or an attacker had gotten Mythos first, and he treats the decision to withhold and patch as a rare “take the ring to Mordor” moment that may not repeat next time.

The Breakdown

Mythos isn't coming out — and that's the point

Ziv opens bluntly: Anthropic is not broadly releasing its new top model, Claude Mythos, because its cyber capabilities are too dangerous. Instead, it’s doing a limited release to trusted security partners so they can patch as much critical software as possible before “a very different era” arrives.

Project Glasswing: the corporate cavalry shows up

He introduces Project Glasswing as a coordinated defensive push with more than 40 organizations maintaining critical software, plus heavyweight launch partners like AWS, Apple, Google, Microsoft, Nvidia, Cisco, CrowdStrike, Palo Alto Networks, and JPMorgan Chase. Anthropic says Mythos has already uncovered thousands of zero-days in major operating systems, browsers, and other core software, backed by $100 million in free credits and $4 million in donations.

The model card says the quiet part out loud

Ziv walks through Anthropic’s own summary: Mythos is a “step change” in autonomous vulnerability discovery and exploitation, able to find zero-days in open and closed source code with minimal human steering and often turn them into proof-of-concept exploits. He notes Anthropic’s mitigation plan is basically limited access plus monitoring, because there just aren’t many good options once a model can do this.

Benchmarks are saturated, but Mythos still breaks away

Most CTF-style tests are maxed out now, so Anthropic leans on Cyber Gym and real-world tasks. The standout chart: Mythos scores 0.83 on Cyber Gym versus 0.67 for Opus 4.6, and on Firefox JS shell exploitation it posts 72.4% full success while Opus is below 1%, which Ziv treats as a dead-simple rebuttal to “this is just more of the same.”

The scary part: overnight exploit generation

The red-team report is where Ziv’s tone turns from impressed to rattled. He highlights claims that Anthropic employees with no formal security training could ask Mythos to find an RCE overnight and wake up to a working exploit, including a fully autonomous exploit for a 17-year-old FreeBSD NFS bug that granted root access to unauthenticated internet users.

OpenBSD, browser chains, and why this feels different

He lingers on the memorable examples: a now-patched 27-year-old OpenBSD bug, browser exploit chains involving four vulnerabilities and JIT heap sprays, and paths from an ordinary web page to kernel-level access. His point is that these aren’t toy demos or “stack smashing 101” — Mythos is chaining subtle bugs into real attack paths that experts failed to surface for decades.

The backlash: critics proved validation, not discovery

Ziv spends a long section on the skeptical response, especially claims that cheap open models could reproduce much of Anthropic’s analysis. His rebuttal is that critics were handed the needle after Anthropic had already searched the haystack: small models may help validate a narrowed-down issue, but that’s not the same as autonomously finding severe bugs at scale without absurd false positives.

Bigger than cyber: geopolitics, compute, and the next locked models

From there the video widens out. Ziv says we are entering an era where the best models may stay internal for months, where stolen weights would be a national-security crisis, and where the counterfactual of “what if China got here first?” matters a lot more than people want to admit. He closes by tying Mythos back to the bigger warning: this wasn’t a cyber model per se, it was a stronger coding model, which to him is another flashing sign that AI progress toward automated R&D and potentially existential danger is very much still accelerating.