Theo - t3.gg · May 5, 2026 · 41m

Prime is (mostly) right about AI

TL;DR

The real bottleneck is compute, not greed — Theo agrees with Prime that the AI economics are shifting, but argues Anthropic and Microsoft are tightening usage because they’re short on GPUs and capacity, not because $20 or $40 subscriptions are suddenly the main revenue target.
Per-message pricing is breaking because one 'message' can cost wildly different amounts — Theo points to Cursor’s earlier pricing change and GitHub Copilot’s new model multipliers, noting he had a single Copilot request run for over 2 hours and potentially burn $100+ of compute while counting as just one message.
Anthropic’s Claude Code drama is really about reclaiming compute for enterprise customers — Theo says moving Claude Code off cheaper tiers or shrinking peak-hour limits isn’t primarily about upselling users to $100-$200 plans; it’s about freeing capacity for enterprise contracts, where Anthropic makes the bulk of its money.
Google is not the exception — it may be the most aggressive subsidizer of all — Theo calls Prime’s Google take a miss, arguing Google hands out huge amounts of free inference through AI Overviews and Gemini products, then claws it back quickly when demand spikes, partly because even Google is compute-constrained.
Model costs are getting weirder: frontier training is expensive, but usable intelligence is getting cheaper — He distinguishes expensive pre-training from cheaper post-training/fine-tuning, then uses Artificial Analysis numbers to show GPT-5.5 Medium roughly matches GPT-5.4 X High while cutting run cost from about $2,800 to $1,200.
Enterprise AI usage is on a different economic planet from consumer subscriptions — Theo says a solo dev might get massive value from a $200 Claude plan, but a company like Uber pays API rates directly, because enterprise use requires paid contracts, data controls, and zero-retention options. As a result, some engineers there reportedly spend more on inference than they earn in salary.

Summary

Theo sets the frame: Prime is right, but the details matter

Theo opens by saying he’s become much more pro-AI for development over time, and that Primeagen’s new video nails a lot of the big-picture shift. But he wants to push past the usual "AI lovers vs. haters" split and explain where the economics get muddled, especially once people start talking about Microsoft, Google, and Anthropic as if they’re all behaving for the same reason.

The first crack wasn’t Anthropic — it was Cursor

Before getting to Prime’s Anthropic example, Theo rewinds to what he sees as the real start of the problem: Cursor abandoning simple message-based pricing. His point is blunt: charging by message is nonsense when one request might cost a few cents and another might cost many dollars, which is why tools that don’t own their own compute hit the wall first.
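To make the mismatch concrete, here is a minimal sketch of flat per-message billing set against token-metered compute cost. Every price and token count below is a made-up illustration, not Cursor's, GitHub's, or any lab's actual rate; only the shape of the problem comes from Theo's argument.

```python
# Hypothetical prices, for illustration only (not any vendor's real rates).
PRICE_PER_MTOK_IN = 3.00       # dollars per million input tokens (assumed)
PRICE_PER_MTOK_OUT = 15.00     # dollars per million output tokens (assumed)
FLAT_PRICE_PER_MESSAGE = 0.04  # dollars billed per message (assumed)


def compute_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the assumed per-token prices."""
    return (input_tokens * PRICE_PER_MTOK_IN
            + output_tokens * PRICE_PER_MTOK_OUT) / 1_000_000


def margin(input_tokens: int, output_tokens: int) -> float:
    """What the provider keeps (or loses) on one flat-priced message."""
    return FLAT_PRICE_PER_MESSAGE - compute_cost(input_tokens, output_tokens)


# Two requests that each count as exactly one "message" (token counts assumed).
quick_question = compute_cost(input_tokens=2_000, output_tokens=500)
long_agent_run = compute_cost(input_tokens=20_000_000, output_tokens=2_000_000)

print(f"quick question: ${quick_question:.4f}")
print(f"long agent run: ${long_agent_run:.2f}")
print(f"margin on quick question: ${margin(2_000, 500):.4f}")
print(f"margin on long agent run: ${margin(20_000_000, 2_000_000):.2f}")
```

Under these assumed numbers, the quick question leaves a small positive margin while the long agent run loses almost its entire compute cost, even though both are billed identically. A reseller with no compute of its own has nothing to absorb that gap with.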

Anthropic’s peak-hour experiments reveal the real issue

Theo walks through Anthropic’s off-peak incentives and then its reduced 5-hour session limits during peak hours, arguing these weren’t normal pricing experiments so much as capacity triage. The key image is simple: if you’ve got 100 GPUs and 95 are busy during the workday, those last five matter a lot more than they do overnight.
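The 100-GPU image can be worked through as simple headroom arithmetic. In this rough sketch, the 100 total and 95 busy figures are Theo's; the overnight utilization and the size of the new workload are assumptions added purely for illustration.

```python
def headroom(total_gpus: int, busy_gpus: int) -> int:
    """GPUs still free after current demand is served."""
    if not 0 <= busy_gpus <= total_gpus:
        raise ValueError("busy_gpus must be between 0 and total_gpus")
    return total_gpus - busy_gpus


def share_of_headroom(extra_gpus: int, total_gpus: int, busy_gpus: int) -> float:
    """Fraction of the remaining free capacity a new workload would consume."""
    free = headroom(total_gpus, busy_gpus)
    if free == 0:
        raise ValueError("no free capacity left")
    return extra_gpus / free


TOTAL_GPUS = 100    # from Theo's example
PEAK_BUSY = 95      # from Theo's example (workday)
OFF_PEAK_BUSY = 40  # assumed overnight load, not from the video
NEW_WORKLOAD = 3    # assumed size of an extra workload, in GPUs

peak_share = share_of_headroom(NEW_WORKLOAD, TOTAL_GPUS, PEAK_BUSY)
off_peak_share = share_of_headroom(NEW_WORKLOAD, TOTAL_GPUS, OFF_PEAK_BUSY)

print(f"peak: {peak_share:.0%} of free capacity")
print(f"off-peak: {off_peak_share:.0%} of free capacity")
```

The same three-GPU workload takes most of what is left at peak and a small slice overnight, which is why shifting usage to off-peak hours is worth an incentive.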

Subscription plans are marketing, enterprise is the business

This is one of Theo’s main corrections to Prime: Anthropic doesn’t care that much whether users jump from $20 to $100 plans. What it cares about is keeping enough compute free for enterprise customers, because consumer subscriptions are heavily subsidized — in some cases, Theo says, a $200 plan can deliver $2,000 to $5,000 worth of inference.

Why model economics are more subtle than “this release lost money”

Theo agrees with the broad idea that labs can look unprofitable even if individual model generations make money, and he cites Dario Amodei’s framing that each model can be thought of like its own company. But he pushes back on the idea that every weaker or shorter-lived release is a financial disaster, explaining the difference between massively expensive pre-training and the often much cheaper post-training steps: reinforcement learning with verifiable rewards (RLVR) and fine-tuning.

Microsoft’s Copilot changes are about capacity, not a cash grab

This is the section that clearly set Theo off. He says Copilot’s new usage rules and even the pause on signups are not evidence of Microsoft suddenly trying to squeeze developers; they’re evidence that Microsoft needs those GPUs for higher-value enterprise demand, and pausing signups is the tell that capacity — not pricing psychology — is driving the decision.

Google is the “free compute” king, not the disciplined outlier

Theo strongly rejects Prime’s claim that Google avoids the same pressure because it has more money. He points to AI Overviews in Google Search, subsidized Gemini and Antigravity usage, user bans, and usage restrictions as proof that Google gave away enormous amounts of inference and then had to pull back hard, partly because its own product execution is messy and partly because even Google is compute-constrained.

Yes, subsidies are tightening — but useful intelligence is still getting cheaper

Near the end, Theo shifts from capacity doom to a more optimistic nuance: the cost per token can go up while the cost to solve real tasks goes down. Using Artificial Analysis benchmarks, he argues GPT-5.5 Medium matching GPT-5.4 X High at less than half the total run cost is the better lens, and says the market is not heading toward "back to hand-coding" so much as a world where access gets rationed around scarce compute.
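The "price per token up, cost per task down" point is easy to check with arithmetic. In the sketch below, the $2,800 and $1,200 run costs are the Artificial Analysis figures quoted above; the per-token prices and token counts are hypothetical, chosen only to show how the two trends can coexist.

```python
def task_cost(price_per_mtok: float, tokens_used: int) -> float:
    """Dollar cost of finishing one task at a given per-million-token price."""
    return price_per_mtok * tokens_used / 1_000_000


# Hypothetical models: the newer one charges more per token but needs
# far fewer tokens to finish the same task (all four numbers assumed).
older_model = task_cost(price_per_mtok=10.0, tokens_used=400_000)
newer_model = task_cost(price_per_mtok=15.0, tokens_used=120_000)

# Benchmark run costs as quoted in the summary (approximate dollars).
GPT_5_4_XHIGH_RUN = 2_800
GPT_5_5_MEDIUM_RUN = 1_200
run_cost_ratio = GPT_5_5_MEDIUM_RUN / GPT_5_4_XHIGH_RUN

print(f"older model, per task: ${older_model:.2f}")
print(f"newer model, per task: ${newer_model:.2f}")
print(f"newer run cost as a share of older: {run_cost_ratio:.0%}")
```

With the quoted figures, the newer run costs roughly 43% of the older one, which matches the "less than half" claim. The hypothetical pair shows the mechanism: a 50% higher token price is more than offset when the model spends under a third of the tokens.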
