Why No One Has Figured Out AI Pricing Yet
TL;DR
Token-based billing is the core problem: Input and output tokens are priced differently (output costs 2-5x more), and agentic AI loops resend the entire conversation history as input on every turn, so costs climb with every additional turn as well as with query volume.
Agent loops are the hidden cost multiplier: AI coding agents and customer support assistants resend large context (such as a 20,000-token knowledge base) on every request. At 1,000 queries per day, that adds up to 20 million input tokens per day, costing $60 just to reread static content.
Prompt caching helps but varies wildly by provider: Anthropic requires you to mark cacheable content manually, OpenAI caches automatically, and Google has two modes. Caching is the main way to reduce API costs, but most businesses on per-seat plans can't access these levers.
Per-seat licensing and pooled usage add another layer of confusion: Claude Team gives each user an individual usage limit, Claude Enterprise pools usage across the organization, Gemini Enterprise pools by default with clear quotas, and OpenAI works differently again. You can end up with eight different billing behaviors even within one provider.
Labs want to sell 24/7 agentic loops but can't price them predictably: The vision is a 300-person marketing team run by 15-30 people, with autonomous agents burning tokens continuously. Enterprises can't plan for that without cost predictability, and most executives can't even grasp current model capabilities.
No simple solution exists because adoption and literacy are mismatched: Raising per-seat prices to $500/month works for advanced users like SmarterX but fails for most enterprises that haven't seen ROI yet; labs can't market human replacement costs, and the average knowledge worker has no framework to understand token economics.
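The agent-loop arithmetic in the TL;DR can be checked with a short Python sketch. The function names are hypothetical, and the $3.00 per million input tokens rate is an assumption inferred from the article's own figures ($60 for 20 million tokens), not a quoted provider price. The second function assumes each turn adds a fixed number of tokens and that the full history is resent as input on every turn, which is a simplification of real agent behavior.

```python
def static_context_cost(context_tokens, queries_per_day, usd_per_million_input):
    """Daily tokens and cost of resending the same static context on every query."""
    daily_tokens = context_tokens * queries_per_day
    daily_cost = daily_tokens / 1_000_000 * usd_per_million_input
    return daily_tokens, daily_cost


def agent_loop_input_tokens(tokens_per_turn, turns):
    """Total input tokens billed when every turn resends the whole history.

    Turn t sends t * tokens_per_turn tokens as input, so the total grows
    roughly with the square of the number of turns.
    """
    return sum(tokens_per_turn * t for t in range(1, turns + 1))


if __name__ == "__main__":
    # Article's example: 20,000-token knowledge base, 1,000 queries per day.
    # The 3.00 USD rate is inferred from the article's figures (assumption).
    tokens, cost = static_context_cost(20_000, 1_000, 3.00)
    print(f"{tokens:,} input tokens per day, ${cost:.2f} per day")

    # Hypothetical loop: 1,000 new tokens per turn over 10 turns.
    print(f"{agent_loop_input_tokens(1_000, 10):,} input tokens billed in total")
```

Under these assumptions, a 10-turn loop that adds 1,000 tokens per turn is billed for 55,000 input tokens even though only 10,000 tokens of new content were produced, which is why long-running loops are hard to budget for.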
The Breakdown
AI pricing is a chaotic, developer-focused mess. Enterprises are struggling with exploding token costs and no clear path to predictability: companies like RBC and Cisco report usage jumps of 500% and "pretty crazy" levels, while the labs themselves seem to be making it up as they go.