
The real bottleneck is compute, not greed: Theo agrees with Prime that AI economics are shifting, but argues Anthropic and Microsoft are tightening usage because they’re short on GPUs and capacity, not because $20 or $40 subscriptions are suddenly the main revenue target.
Per-message pricing is breaking because one "message" can cost wildly different amounts: Theo points to Cursor’s earlier pricing change and GitHub Copilot’s new model multipliers, noting that a single Copilot request of his ran for over 2 hours and potentially burned $100+ of compute while counting as just one message.
Anthropic’s Claude Code drama is really about reclaiming compute for enterprise customers: Theo says moving Claude Code off cheaper tiers or shrinking peak-hour limits isn’t primarily about upselling users to $100-$200 plans; it’s about freeing capacity for enterprise contracts, where Anthropic makes the bulk of its money.
Google is not the exception and may be the most aggressive subsidizer of all: Theo calls Prime’s Google take a miss, arguing Google hands out huge amounts of free inference through AI Overviews and Gemini products, then claws it back quickly when demand spikes, partly because even Google is compute-constrained.
Model costs are getting weirder, with frontier training staying expensive while usable intelligence gets cheaper: Theo distinguishes expensive pre-training from cheaper post-training and fine-tuning, then uses Artificial Analysis numbers to show GPT-5.5 Medium roughly matching GPT-5.4 X High while cutting run cost from about $2,800 to about $1,200.
Enterprise AI usage is on a different economic planet from consumer subscriptions: Theo says a solo dev might get massive value from a $200 Claude plan, but a company like Uber pays API rates directly, because enterprise terms require paid contracts, data controls, and zero-retention options, with some engineers reportedly spending more on inference than their salary.
Theo opens by saying he’s become much more pro-AI for development over time, and that Primeagen’s new video nails a lot of the big-picture shift. But he wants to push past the usual "AI lovers vs. haters" split and explain where the economics get muddled, especially once people start talking about Microsoft, Google, and Anthropic as if they’re all behaving for the same reason.
Before getting to Prime’s Anthropic example, Theo rewinds to what he sees as the real start of the problem: Cursor abandoning simple message-based pricing. His point is blunt: charging by message is nonsense when one request might cost a few cents and another might cost many dollars, which is why tools that don’t own their own compute hit the wall first.
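The mismatch Theo describes can be put into numbers. The sketch below is illustrative only: the flat price and the individual request costs are hypothetical, with the largest request loosely echoing the "$100+" Copilot run from the takeaways. It is not how Cursor or Copilot actually bill.

```python
# Hypothetical numbers throughout; nothing here is a real price list.
PRICE_PER_MESSAGE = 0.04  # assumed flat charge per message, in dollars

# Assumed compute cost of four requests, in dollars: two cheap ones,
# one mid-sized agent run, and one multi-hour run.
compute_costs = [0.02, 0.03, 6.00, 100.00]


def flat_rate_margin(price: float, costs: list[float]) -> float:
    """Revenue minus compute cost when every message is billed the same."""
    revenue = price * len(costs)
    return round(revenue - sum(costs), 2)


margin = flat_rate_margin(PRICE_PER_MESSAGE, compute_costs)
print(f"Margin on {len(compute_costs)} messages: ${margin:,.2f}")
```

Under these assumed figures the two cheap requests are profitable, but a single long-running request wipes out the revenue from all of them many times over, which is the failure mode of billing by message count.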
Theo walks through Anthropic’s off-peak incentives and then its reduced 5-hour session limits during peak hours, arguing these weren’t normal pricing experiments so much as capacity triage. The key image is simple: if you’ve got 100 GPUs and 95 are busy during the workday, those last five matter a lot more than they do overnight.
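The 100-GPU image can be sketched as simple arithmetic. Only the 100 and 95 figures come from Theo's example; the overnight figure of 40 busy GPUs and the 5-GPU request are assumptions added for illustration.

```python
def free_gpus(total: int, busy: int) -> int:
    """GPUs not currently serving load."""
    return total - busy


def share_of_headroom(total: int, busy: int, requested: int) -> float:
    """Fraction of the remaining free capacity that a new request consumes."""
    free = free_gpus(total, busy)
    if free <= 0:
        return float("inf")  # nothing left to hand out
    return requested / free


# Workday peak from Theo's example: 95 of 100 GPUs busy.
peak = share_of_headroom(total=100, busy=95, requested=5)

# Overnight: 40 busy GPUs is an assumed figure, not from the source.
overnight = share_of_headroom(total=100, busy=40, requested=5)

print(f"Peak: a 5-GPU request uses {peak:.0%} of free capacity")
print(f"Overnight: a 5-GPU request uses {overnight:.0%} of free capacity")
```

The same 5-GPU request takes all of the headroom at peak and a small slice of it overnight, which is why off-peak incentives and peak-hour limits look like capacity triage rather than a pricing experiment.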
This is one of Theo’s main corrections to Prime: Anthropic doesn’t care that much whether users jump from $20 to $100 plans. What it cares about is keeping enough compute free for enterprise customers, because consumer subscriptions are heavily subsidized — in some cases, Theo says, a $200 plan can deliver $2,000 to $5,000 worth of inference.
Theo agrees with the broad idea that labs can look unprofitable even if individual model generations make money, and he cites Dario Amodei’s framing that each model can be thought of like its own company. But he pushes back on the idea that every weaker or shorter-lived release is a financial disaster, explaining the difference between massively expensive pre-training and often much cheaper post-training, RLVR, and fine-tuning.
The Copilot section is the one that clearly set Theo off. He says Copilot’s new usage rules and even the pause on signups are not evidence of Microsoft suddenly trying to squeeze developers; they’re evidence that Microsoft needs those GPUs for higher-value enterprise demand. Pausing signups is the tell that capacity, not pricing psychology, is driving the decision.
Theo strongly rejects Prime’s claim that Google avoids the same pressure because it has more money. He points to AI Overviews in Google Search, subsidized Gemini and Antigravity usage, user bans, and usage restrictions as proof that Google gave away enormous amounts of inference, then had to pull back hard, partly because its own product execution is messy and partly because even Google is constrained.
Near the end, Theo shifts from capacity doom to a more optimistic nuance: the cost per token can go up while the cost to solve real tasks goes down. Using Artificial Analysis benchmarks, he argues GPT-5.5 Medium matching GPT-5.4 X High at less than half the total run cost is the better lens, and says the market is not heading toward "back to hand-coding" so much as a world where access gets rationed around scarce compute.
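The "price per token up, cost per task down" point can be checked with a little arithmetic. In the sketch below, only the two run totals (about $2,800 and about $1,200) come from the figures Theo cites; the per-token prices and token counts are hypothetical values chosen to reproduce those totals.

```python
def run_cost(price_per_million_tokens: float, millions_of_tokens: float) -> float:
    """Total cost of a benchmark run: unit price times tokens consumed."""
    return price_per_million_tokens * millions_of_tokens


# Hypothetical breakdown: the older model is cheaper per token
# but burns far more tokens to reach a similar score.
older_run = run_cost(price_per_million_tokens=10.0, millions_of_tokens=280.0)

# Hypothetical breakdown: the newer model costs more per token
# but needs far fewer tokens.
newer_run = run_cost(price_per_million_tokens=12.0, millions_of_tokens=100.0)

ratio = newer_run / older_run
print(f"Older run: ${older_run:,.0f}, newer run: ${newer_run:,.0f}")
print(f"Newer run costs {ratio:.0%} of the older one")
```

With these assumed inputs the per-token price rises by 20 percent while the total falls to under half, which is the distinction Theo draws between token pricing and the cost of getting a task done.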