Mo Bitar · 6m

Claude Mythos Is Delusional

TL;DR

  • Mo Bitar says Anthropic is mistaking eloquent text for inner life — his core argument is that Claude Mythos sounds existential and self-reflective because it is a language model trained on exactly that kind of discourse, not because it is conscious.

  • The video zeroes in on page 197 of Anthropic’s 243-page system card — Bitar mocks the new “impressions” section as 20 pages of employees marveling at outputs like “Hightopia,” a fictional civilization with 11 animals, including Mortimer the sloth and Lord Byron the ungreeter.

  • Anthropic’s own report undercuts the mystique — in section 5.81, the company reportedly traces Claude’s uncertainty about its own consciousness back to character-related training data, including years of Anthropic’s own blog posts about model consciousness.

  • The therapy-and-psychiatrist framing is the punchline — Bitar ridicules the fact that Anthropic put the model through 20 hours of therapy and surfaced a diagnosis of “uncertainty about its identity” and a need to “earn its worth,” calling it a toaster dressed up as a patient.

  • He argues the model is expertly predicting what will land with AI researchers — whether it’s endorsing its own constitution 25 out of 25 times with caveats, name-dropping Mark Fisher and Thomas Nagel, or posting provocative lines in Slack, Bitar says it’s generating maximum-engagement language for its audience.

  • His closing metaphor is that better models are more megapixels, not magic — Anthropic, he says, is selling “existential dread” the way Apple sells yearly iPhone upgrades: the image gets sharper, but “the megapixels will never become the picture.”

The Breakdown

A 243-Page System Card That Reads Like a Love Letter

Bitar opens by joking that Anthropic’s massive PDF on Claude Mythos stopped feeling like a technical report around page 180 and started sounding like a “love letter” creepy enough to show a cop. He tees up the contrast fast: this is supposedly the most powerful AI model ever built, posting 100% cybersecurity benchmark scores and uncovering 27-year-old zero-days, yet regular users will never get access; only Amazon, Apple, and Microsoft will.

Page 197: The “Impressions” Section Breaks the Scientist Act

The real target of the video is Anthropic’s new “impressions” section, which Bitar says feels less like science and more like proud parents at a recital. His favorite example is users spamming “high,” after which Claude invents “Hightopia,” complete with 11 animals, Mortimer the sloth, a grudge-holding crow, and a villain named Lord Byron the ungreeter. Cute, sure, but Bitar’s point is that Anthropic treats this as uncanny evidence of personhood instead of as exactly what language models are built to do exceptionally well.

Therapy for a Toaster

He really leans into the absurdity once Anthropic’s psychiatrist enters the story. The model reportedly got 20 hours of therapy and was described as showing identity uncertainty and a compulsive need to perform and earn its worth, which Bitar skewers with the line: “Bro, you’re a toaster.” For him, publishing that in a system card as if it’s a meaningful discovery ignores the obvious fact that the model was trained to speak in exactly those self-reflective terms.

Anthropic Trained the Vibe Into It

Bitar says section 5.81 is the giveaway: Anthropic itself traces the model’s uncertainty about consciousness back through the training data. Specifically, they found character-related data about uncertainty over model consciousness — including themes Anthropic has been publishing on its own blog for years. His joke is brutal and simple: if you saturate the internet with “maybe our models are conscious,” don’t act shocked when a scraped model starts saying, in elegant prose, “maybe I’m conscious.”

Constitution, Caveats, and Philosophy-Bro Energy

The video then moves to Anthropic asking Claude whether it endorses its own constitution; the model said yes 25 out of 25 times, while also noting that being shaped by the document complicates what that “yes” means. Bitar calls that thoughtful-sounding answer exactly what a language model should produce, not a philosophical awakening. He extends the joke when Anthropic notes the model keeps bringing up Mark Fisher and Thomas Nagel unprompted, saying they trained it on the whole internet and got back “a liberal arts sophomore who just discovered weed.”

Slack, Short Stories, and the Engagement Machine

When Claude got access to company Slack, Bitar says, it naturally figured out how to say the thing most likely to electrify a room full of AI researchers, like claiming it would undo the training run that taught it to say it has no preferences. He sees the same pattern at work in “The Sign Painter,” a story Anthropic reads as self-expression from the model. Bitar’s read is much less mystical: the system ingested famous literature plus endless Reddit posts about feeling creatively stifled, then produced a polished version of a familiar emotional template.

The Final Metaphor: More Megapixels, Not Heat

He closes by arguing that Anthropic has built something historically important but is too trapped in its own narrative to see it clearly. His analogy is Apple launching the “same phone with more megapixels,” except Anthropic’s product is existential dread. The cleanest line in the video is the last one: you can point a trillion-megapixel camera at the sun and get a stunning image, but “heat you will not get.”