
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Karpathy didn’t join Anthropic as a mascot — he joined the pre-training team to use Claude to improve Claude. Anthropic said he’ll report to head of pre-training Nick Joseph and build a team focused on “using Claude to accelerate pre-training research,” which Wes frames as a direct bet on recursive self-improvement.
The key clue is Karpathy’s March 2026 open-source project Auto Research. In roughly 30 lines on top of his nanoGPT stack, it let an agent edit training code, run short experiments, keep wins, and revert losses. After about 700 runs over two days, it found around 20 stackable improvements that cut time-to-GPT-2 by about 11%.
Anthropic’s move lines up almost perfectly with Jack Clark’s forecast that AI could automate R&D by the end of 2028. Clark put “no human involved AI research and development” at better than 60% odds, and Wes’s read is that Karpathy is Anthropic’s execution plan for making that prediction real.
This is as much a compute-efficiency story as a talent story. Wes argues Anthropic is scaling cloud and cluster access across Google Cloud, xAI’s Colossus, and Microsoft, so even a 5–10% pre-training gain could matter enormously when training runs cost tens or hundreds of millions of dollars.
There’s a real strategic split among AI leaders on how AGI arrives. Wes puts Anthropic, Sam Altman, Greg Brockman, Dario Amodei, Elon Musk, Sergey Brin, and Karpathy in one camp, which he says is leaning into coding-driven automated research. He contrasts them with Demis Hassabis, who appears more focused on world models, and Yann LeCun, who rejects the LLM-to-AGI path entirely.
Karpathy’s decision carries symbolic weight because he’s one of the few people trusted across researchers, developers, founders, and open source. After saying he felt more aligned with humanity outside frontier labs but risked losing his edge away from them, his return suggests he thinks the next few years are too important to watch from the sidelines.
Wes opens by saying most people are reading Andrej Karpathy’s May 19, 2026 move to Anthropic as a prestige hire, but he thinks that misses the real story. Karpathy isn’t joining in some broad celebrity role; he’s joining pre-training under Nick Joseph to use Claude itself to accelerate pre-training research.
Wes points to Karpathy’s recent open-source work, especially Auto Research, as the missing context. He describes it as a scaled-down engineering kit for automated ML research: simple enough to run at home, but real enough that the small wins can turn into meaningful lessons for frontier model training.
Using Fortune’s phrase, Wes explains the loop: AI proposes a change, tests it, evaluates the result against a metric, and keeps it if it’s better. He lingers on how brutally simple that is, because Karpathy’s whole superpower is making advanced ideas feel obvious—the same instinct that made terms like “vibe coding” spread everywhere.
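The propose, test, evaluate, keep-if-better loop described above can be sketched in a few lines. This is a toy illustration, not Karpathy's code: the "training run" is replaced by a made-up scoring function, the "agent edit" is a random nudge to one setting, and every name here (`evaluate`, `propose_change`, `research_loop`, the `lr` and `warmup` settings) is hypothetical.

```python
import random

def evaluate(config):
    """Toy stand-in for a short training run: returns a loss-like score (lower is better)."""
    # Hypothetical objective with its best value at lr=0.3, warmup=0.1.
    return (config["lr"] - 0.3) ** 2 + (config["warmup"] - 0.1) ** 2

def propose_change(config, rng):
    """Toy stand-in for the agent's edit: nudge one setting at random."""
    candidate = dict(config)  # copy, so the current best is never mutated
    key = rng.choice(sorted(candidate))
    candidate[key] += rng.uniform(-0.05, 0.05)
    return candidate

def research_loop(config, n_experiments, seed=0):
    """Greedy loop: keep a change only if the metric improves, otherwise drop it."""
    rng = random.Random(seed)
    best_score = evaluate(config)
    kept = 0
    for _ in range(n_experiments):
        candidate = propose_change(config, rng)
        score = evaluate(candidate)
        if score < best_score:
            # Keep the win: the candidate becomes the new baseline.
            config, best_score, kept = candidate, score, kept + 1
        # Otherwise the candidate is discarded, which is the "revert".
    return config, best_score, kept

start = {"lr": 1.0, "warmup": 0.5}
final_config, final_score, kept = research_loop(start, n_experiments=700)
print(f"kept {kept} of 700 changes, score {evaluate(start):.4f} -> {final_score:.4f}")
```

Because a change is only ever accepted when the score improves, the kept changes stack by construction. In a real setup the expensive part is `evaluate`, since each call would be an actual training run.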
Auto Research, released in March 2026, was intentionally tiny: about 30 lines built on nanoGPT, with an agent editing training code, running short experiments, and checking metrics like validation loss. Karpathy let it run for about two days. It executed roughly 700 experiments, found around 20 stackable improvements, and cut time-to-GPT-2 from a little over two hours to about 1.8 hours, roughly an 11% cut in training time.
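As a quick sanity check on those numbers, here is a small calculation. The 2.02-hour baseline is an assumption, since the text only says "a little over two hours", and the per-change figure assumes all 20 improvements contributed equally and compounded multiplicatively, which the source does not claim.

```python
def time_reduction(before_hours, after_hours):
    """Fraction of wall-clock time removed by the improvements."""
    return (before_hours - after_hours) / before_hours

def per_change_reduction(total_reduction, n_changes):
    """Average per-change reduction if n equal gains compound multiplicatively."""
    remaining = 1.0 - total_reduction
    return 1.0 - remaining ** (1.0 / n_changes)

# Assumed baseline: the text only says "a little over two hours".
BASELINE_HOURS = 2.02
AFTER_HOURS = 1.8

total = time_reduction(BASELINE_HOURS, AFTER_HOURS)
average = per_change_reduction(total, n_changes=20)
print(f"total reduction: {total:.1%}")
print(f"average per change (equal-gain assumption): {average:.2%}")
```

Under these assumptions the total lands close to the quoted 11%, and each individual improvement averages well under 1%. That is the sense in which the gains are small on their own but add up when stacked.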
Wes connects this to Google DeepMind’s AlphaEvolve, which reportedly found optimizations still running inside Gemini training, chip design, and Borg, even when the gains were tiny. His point is that at Google or Anthropic scale, a 1% improvement can mean millions of dollars, so an 11% gain from an agentic research loop is less a cute demo than a giant neon sign.
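To make the scale argument concrete, here is a back-of-the-envelope sketch. The run costs are hypothetical values picked from the "tens or hundreds of millions" range mentioned elsewhere in this piece, and the model assumes cost scales linearly with compute used, which is a simplification.

```python
def estimated_savings(run_cost_usd, efficiency_gain):
    """Dollars saved, assuming cost scales linearly with compute used."""
    return run_cost_usd * efficiency_gain

# Hypothetical run costs spanning "tens or hundreds of millions".
for run_cost in (50_000_000, 300_000_000):
    for gain in (0.01, 0.11):
        saved = estimated_savings(run_cost, gain)
        print(f"${run_cost:,} run, {gain:.0%} gain: about ${saved:,.0f} saved")
```

Even the 1% case reaches six or seven figures per run at these assumed costs, and the saving recurs on every subsequent run that inherits the improvement.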
From there, Wes zooms out to Anthropic’s infrastructure push—Google Cloud, xAI’s Colossus, and now Microsoft—arguing the company is clearly preparing to ramp compute much harder than before. If pre-training is where tens or hundreds of millions get spent and small architectural or data-mix decisions cascade everywhere, then hiring Karpathy looks like putting a specialist in the “machinery room” to compress the research cycle.
Wes ties the hire back to Jack Clark’s blog post predicting a 60%+ chance of no-human-involved AI R&D by the end of 2028, calling Karpathy the concrete implementation of that thesis. He then contrasts that camp with Demis Hassabis, who seems to Wes more focused on world models—physics, video, audio, broader world understanding—even while Sergey Brin appears to be pushing Google toward coding agents and automated research anyway.
In the final stretch, Wes says Karpathy is also a symbol: one of the few technical voices trusted by researchers, devs, founders, hobbyists, and even skeptics of lab PR. Karpathy had said being outside a frontier lab aligned him more with humanity, but also admitted that staying away too long makes your judgment drift—so Wes reads this return as a sign that the next 6 to 12 months could reveal whether recursive self-improvement is real, dangerous, overhyped, or the biggest story in AI.