
Microsoft’s $190 billion capex still doesn’t buy enough AI capacity: Nate Jones says Satya Nadella’s “capacity constrained” comment means the real bottleneck sits below the GPU, in HBM memory, packaging, power, cooling, and data center buildout.
AI vendor contracts now behave like supply contracts, not software licenses: because vendors depend on hyperscaler allocation, buyers should ask for reserved capacity, fallback providers, allocation tiers, and concrete outage plans instead of relying on “best efforts.”
The choke point is not chip design but turning chips into usable systems: Epoch AI estimates the top four AI chip designers used about 90% of global chip packaging capacity and 90% of HBM supply in 2025, while consuming only 12% of advanced logic die production.
The real unit of AI infrastructure is the rack-scale module, not the GPU: Jones uses Nvidia’s GB200 NVL72 as the example, with 72 Blackwell GPUs, 36 Grace CPUs, 13.5 TB of HBM3e, and 576 TB/s of memory bandwidth built for trillion-parameter inference.
Cheaper inference is helping, but it’s also driving more demand: Microsoft improved Copilot inference throughput by 40% in one quarter, yet Jones argues the Jevons paradox still rules, because better agents and lower costs lead to longer contexts, more retries, and more token consumption.
Executives need to forecast tokens, not seats: Jones warns that a coding assistant, a support bot, and an autonomous claims agent have radically different token profiles, and says his own usage hit nearly 500 million tokens in a single week.
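To make the "tokens, not seats" point concrete, here is a minimal forecasting sketch. Every number in it (task volumes, tokens per task, retry rates) is a hypothetical placeholder, not a figure from Jones or from any vendor; the only point is that three workloads with similar headcounts can differ by an order of magnitude in weekly token demand.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    tasks_per_day: int      # hypothetical daily task volume
    tokens_per_task: int    # hypothetical average, input plus output
    retry_rate: float       # fraction of tasks that are re-run once


def weekly_tokens(w: Workload, days: int = 7) -> int:
    """Tokens per week, counting one extra run for each retried task."""
    effective_tasks = w.tasks_per_day * (1 + w.retry_rate)
    return round(effective_tasks * w.tokens_per_task * days)


# Illustrative profiles only; replace with measured values from your own logs.
workloads = [
    Workload("coding assistant", 200, 20_000, 0.25),
    Workload("support bot", 5_000, 2_000, 0.10),
    Workload("claims agent", 300, 150_000, 0.50),
]

for w in workloads:
    print(f"{w.name}: {weekly_tokens(w):,} tokens/week")
```

In this made-up example the claims agent handles far fewer tasks than the support bot yet consumes several times more tokens, which is the kind of gap a per-seat forecast would hide.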
Jones opens with Microsoft’s April 29 Q3 earnings call, where Satya Nadella said the company will spend $190 billion on capex this year and still be capacity constrained through year end. His point is blunt: if Microsoft can’t buy enough capacity, nobody should still be thinking about AI like ordinary software procurement.
He argues that six months ago, AI vendor contracts looked like SaaS deals, but now they’re effectively supply agreements “in everything but name.” That means allocation, fallback plans, and capacity guarantees matter, because what you’re really buying is inference — “intelligence” served on top of scarce hyperscaler infrastructure.
Jones’s central metaphor is that AI is not a software product with a fancy backend; it’s an industrial production system. A chat response may look lightweight on screen, but behind it sit chips, HBM memory, packaging, networking, power, cooling, land, data center construction, and operations talent. In his words, “every word in that paragraph came out of a factory.”
He runs through the scale: Meta raised 2026 spend guidance to $125 billion-$145 billion, Amazon landed more than 2.1 million AI chips in the last year with over half on Trainium, and Google spent $185 billion last year. His takeaway is that these are no longer software companies in any meaningful operating sense; they’re physical infrastructure companies, and that changes everyone downstream.
Using Nvidia’s GB200 NVL72 as the concrete example, Jones explains that the real infrastructure unit is the module: 72 Blackwell GPUs, 36 Grace CPUs, 13.5 TB of HBM3e, and 576 TB/s of memory bandwidth. He keeps pulling the camera back to the unglamorous constraints (TSMC CoWoS packaging, substrates, interposers, optics, and especially HBM) because you can have GPUs “on paper” and still not have usable AI accelerators.
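A quick back-of-the-envelope check shows why the rack, rather than the single GPU, is the natural unit for trillion-parameter inference. The rack-level figures are the ones quoted above; the model-size assumption (1 byte per parameter, weights only) is mine and is deliberately optimistic.

```python
import math

# Rack-level figures quoted above for the GB200 NVL72.
GPUS_PER_RACK = 72
HBM_TOTAL_TB = 13.5
BANDWIDTH_TOTAL_TB_S = 576

# Per-GPU share, using decimal units (1 TB = 1000 GB).
hbm_per_gpu_gb = HBM_TOTAL_TB * 1000 / GPUS_PER_RACK
bandwidth_per_gpu_tb_s = BANDWIDTH_TOTAL_TB_S / GPUS_PER_RACK

# ASSUMPTION (not from the source): weights stored at 1 byte per parameter,
# ignoring KV cache, activations, and any replication.
PARAMS = 1_000_000_000_000
BYTES_PER_PARAM = 1
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9

# Smallest number of GPUs whose combined HBM can hold the weights alone.
min_gpus_for_weights = math.ceil(weights_gb / hbm_per_gpu_gb)

print(f"HBM per GPU: {hbm_per_gpu_gb} GB")
print(f"Bandwidth per GPU: {bandwidth_per_gpu_tb_s} TB/s")
print(f"GPUs needed just to hold the weights: {min_gpus_for_weights}")
```

Even under this generous assumption the weights alone do not fit on one GPU, so the memory and interconnect of the whole module determine what can actually be served.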
The video then gets very physical: the IEA projects data center electricity use rising to about 945 terawatt-hours by 2030, but Jones says the issue isn’t abstract national power; it’s firm power at the right site on the right schedule. He cites CBRE’s observation that the old 12-18 month data center timelines break down for 500-megawatt-plus AI campuses, with transmission and interconnection sometimes stretching toward four years. Meta’s Hyperion campus in Louisiana is one example.
Jones highlights Epoch AI’s estimate that the top four chip designers consumed 90% of chip packaging capacity and 90% of HBM supply while using just 12% of advanced logic die production. That’s his case that the bottleneck is system integration, not chip design, and why he thinks the capex story is more industrial constraint than speculative excess, even as CFOs now have to think about 3-5 year GPU depreciation, utilization, and hardware refresh cycles.
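The depreciation and utilization point can be sketched with simple straight-line arithmetic. The purchase price below is a hypothetical placeholder, and straight-line depreciation with no residual value, power, or hosting cost is a simplifying assumption of this sketch, not something the video specifies.

```python
HOURS_PER_YEAR = 8760


def cost_per_utilized_hour(price: float, life_years: float, utilization: float) -> float:
    """Straight-line hardware cost per hour of actual use.

    Ignores residual value, power, cooling, and hosting (simplifying assumption).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    return price / (life_years * HOURS_PER_YEAR * utilization)


# HYPOTHETICAL accelerator price, chosen only to make the arithmetic visible.
PRICE = 40_000.0

for years in (3, 4, 5):
    for utilization in (0.4, 0.8):
        hourly = cost_per_utilized_hour(PRICE, years, utilization)
        print(f"{years}-year life at {utilization:.0%} utilization: ${hourly:.2f}/hour")
```

Halving utilization doubles the effective hourly cost, and shortening the assumed life from five years to three raises it by two thirds, which is why the depreciation schedule and the utilization forecast have to be set together.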
He closes with a practical checklist: what share of spend is reserved capacity versus best efforts, what routing plan moves tasks to cheaper models without hurting UX, and where hidden human supervision is masking failure in supposedly autonomous workflows. His final message is pure Nate Jones energy: the old cloud abstraction of elastic compute is broken, your AI strategy is a production line, and buying an AI tool now means buying a share of a factory.
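One way to picture the routing question in that checklist is a small rule that sends each task to the cheapest model tier rated for it and falls back to the next tier when a provider is unavailable. The tier names, prices, and the 1-5 difficulty scale are all hypothetical; this is a sketch of the idea, not a description of any vendor’s router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_million_tokens: float  # hypothetical price
    max_difficulty: int             # hypothetical 1-5 capability rating


# Illustrative tiers only.
TIERS = [
    ModelTier("small", 0.5, 2),
    ModelTier("medium", 3.0, 4),
    ModelTier("large", 15.0, 5),
]


def route(difficulty: int, unavailable: frozenset = frozenset()) -> ModelTier:
    """Pick the cheapest available tier rated for the task's difficulty."""
    candidates = [
        t for t in TIERS
        if t.max_difficulty >= difficulty and t.name not in unavailable
    ]
    if not candidates:
        raise RuntimeError("no available tier can handle this task")
    return min(candidates, key=lambda t: t.cost_per_million_tokens)


print(route(1).name)                        # easy task, all tiers up
print(route(3).name)                        # harder task
print(route(1, frozenset({"small"})).name)  # fallback when a tier is down
```

The case that raises an error, a hard task with the only capable tier unavailable, is the one the reserved-capacity and fallback questions are meant to surface before it happens in production.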