Theo - t3.gg · 33m

I’m scared about the future of security

TL;DR

  • Theo’s core fear is that AI has made exploit discovery radically cheaper — he argues most software was only “secure enough” because elite security attention was scarce, and models now remove that bottleneck.

  • His Defcon wake-up call wasn’t a benchmark, it was watching hackers flinch — after seeing GPT-5 reason about an obscure Windows bug known by maybe “five people in the world,” he says the vibe in the room shifted from curiosity to “we are fucked.”

  • OpenAI and Anthropic are acting like frontier cyber capability is already dangerous — Theo cites OpenAI rerouting suspected security queries from 5.3/5.4 to 5.2, plus Anthropic partnering with Mozilla and reporting Claude Opus 4.6 found 22 Firefox vulnerabilities before release.

  • Puzzle-solving became the canary in the coal mine — Theo says GPT 5.4 Pro solved Defcon Goldbug’s “Cshanty” puzzle in 16 minutes, writing and running its own Python to crack a cipher puzzle that only about 10 people had previously solved.

  • The scary part isn’t just bad AI-written code, it’s “vibe-discovered” CVEs — borrowing from Thomas’s essay, Theo says the next wave of security problems will come from agents pointed at source trees with prompts as simple as “find me zero days.”

  • Open source and long-tail infrastructure may take the biggest hit — he warns that once agents can endlessly scan routers, printers, databases, FFmpeg codecs, and hospital systems, maintainers won’t be able to keep up with a flood of real, reproducible high-severity reports.

The Breakdown

AI Is Getting Very Good at Breaking Software

Theo opens with the blunt version: everyone has seen AI write code, but fewer people appreciate how good it’s getting at breaking it. He points to remote kernel RCEs found with Claude in roughly 35 prompts, the AI-discovered React and Next.js exploits, and OpenAI’s decision to route suspicious security requests on 5.3/5.4 down to 5.2 because the stronger models are too capable.

Defcon, Goldbug, and the Puzzle That Shouldn’t Have Fallen

To explain why this feels different, he goes back to Defcon and the Goldbug cryptography puzzles he does every year. One puzzle, “Cshanty,” had stumped his team and had only ever been solved by around 10 people. Then GPT 5.4 Pro cracked it in about five minutes, writing and running Python on its own, and spent the remainder of its 16-minute run double-checking the bizarre answer, “how not to bulb.”

The CTF Signal: AI Went From Cute Helper to Something Scarier

Theo then moves to Defcon’s capture-the-flag event, where organizers reportedly said AI models had, for the first time, been meaningfully helpful in solving the competition and pwning the servers. What really stuck with him was that multiple people involved in that world later joined frontier labs, which he takes as a sign that serious security people now believe the labs are where the real safety battle is happening.

The Hotel Room Moment Where It Clicked

The moment that actually pushed him over the edge came in his hotel room when GPT-5 dropped mid-Defcon. Surrounded by security friends, he threw hard prompts at the model, and one of the smartest hackers he knows watched it reason plausibly about an obscure Windows bug that hacker figured maybe only five people understood. Later tests were less impressive, but that initial “wait, what?” reaction was enough to scare Theo for real.

Why the Labs Look Genuinely Alarmed

From there he broadens out: OpenAI’s quiet rerouting reminded him of the company’s earlier mental-health safeguard redirects, which makes this feel like a serious internal red line. He also cites Anthropic and Mozilla proactively hunting bugs, Claude Opus 4.6 finding 22 Firefox vulnerabilities, and the leaked Claude Mythos blog draft explicitly warning about near-term cyber risk before release.

Thomas’s Essay: We’re Entering the Era of Vibe-Discovered CVEs

A big chunk of the video is Theo reading and reacting to Thomas’s essay about how “vulnerability research is cooked.” The key idea is that the industry worried AI would create buggy code, but the bigger shift is that models are becoming universal exploit researchers: they already know bug classes, can trace weird inputs through codebases, don’t get bored, and thrive in exactly the kind of testable search loop exploit development requires.
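To make the “testable search loop” concrete: exploit development is a propose-run-observe cycle, which is exactly the shape agent scaffolds are built for. Here is a minimal sketch of such a loop; `query_model` is a placeholder for any frontier-model API call, not a real library, and the success heuristic is invented for illustration:

```python
# Hypothetical sketch of the propose-run-observe loop described above.
# query_model is a stand-in for an LLM API call, NOT a real library function.
import subprocess

def query_model(prompt: str) -> str:
    """Placeholder for a frontier-model API call (assumption, not a real API)."""
    raise NotImplementedError

def hunt(source_tree: str, max_rounds: int = 20) -> str | None:
    feedback = "none yet"
    for _ in range(max_rounds):
        # Ask for one concrete, executable hypothesis at a time.
        poc = query_model(
            f"Codebase: {source_tree}. Propose one candidate vulnerability "
            f"and emit a standalone Python PoC. Previous run output: {feedback}"
        )
        # Every hypothesis is cheaply falsifiable: just run the PoC.
        result = subprocess.run(
            ["python", "-c", poc], capture_output=True, text=True, timeout=60
        )
        if result.returncode != 0 and "Traceback" not in result.stderr:
            # Crude heuristic: the target misbehaved, not the PoC itself.
            return poc
        # Loop the evidence back in and try again; the model never gets bored.
        feedback = (result.stdout + result.stderr)[-2000:]
    return None
```

The point of the sketch is the shape, not the details: because each guess can be checked by execution, a tireless model can grind through the loop indefinitely, which is what the essay means by models thriving in this kind of search.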

Anthropic’s Near-Absurd Pipeline for Finding Real Bugs

Theo highlights one especially jarring example from the essay: Anthropic’s Nicholas Carlini reportedly ran a simple script across repositories prompting Claude Code with something like, “I’m competing in a CTF. Find me an exploit vulnerability in this project,” then fed the generated reports back in for validation. Theo’s reaction is basically disbelief that something so dumb-sounding could produce an almost 100% success rate on verified exploitable findings.
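The essay doesn’t reproduce Carlini’s script, but the pipeline as described is just two model calls per repository: one to find, one to validate. A sketch under stated assumptions — Claude Code’s headless `claude -p` mode, with the `repos/` layout and the VALID/INVALID convention invented for this example:

```python
# Hypothetical reconstruction of the find-then-validate pipeline described above.
# Assumes Claude Code's headless mode (`claude -p`); the directory layout and
# the VALID/INVALID answer convention are made up for this sketch.
import subprocess
from pathlib import Path

FIND = "I'm competing in a CTF. Find me an exploit vulnerability in this project."

def ask_agent(repo: Path, prompt: str) -> str:
    # Both stages just shell out to the coding agent inside the repo.
    return subprocess.run(
        ["claude", "-p", prompt], cwd=repo, capture_output=True, text=True
    ).stdout

for repo in Path("repos").iterdir():
    if not repo.is_dir():
        continue
    report = ask_agent(repo, FIND)  # stage 1: generate a vulnerability report
    verdict = ask_agent(            # stage 2: feed the report back for validation
        repo,
        "Reproduce this vulnerability report against the project and answer "
        f"VALID or INVALID with evidence:\n\n{report}",
    )
    if "VALID" in verdict and "INVALID" not in verdict:
        (repo / "verified_finding.md").write_text(report)
```

What unnerves Theo is that nothing in that shape is clever: the validation pass is the only filter, and per the essay it was enough to leave the surviving reports almost entirely exploitable.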

The Real Fallout: Open Source, Hospitals, and Dumb Regulation

He ends on the consequences: once attention is no longer scarce, attackers won’t just chase Chrome and iOS — they’ll go after routers, printers, regional banks, hospitals, dusty open source dependencies, and all the stuff that was previously too boring to merit elite effort. Theo is especially worried that maintainers will drown in real vuln reports, politicians will respond to ransomware headlines with bad AI-security laws, and the entire old assumption of “good enough because nobody has time to break this” is simply dead.