Latent Space | June 18, 2026 | 1h 0m

Why AI Labs With Unlimited GPUs Still Fail — Anjney Midha, AMP

TL;DR

95% node utilization should be table stakes: Midha says single-tenant GPU clusters should run above 95% node utilization and best-in-class MFU (model FLOPs utilization) should be around 60 to 70%; when a lab falls short of that, the issue is usually leadership and alignment, not hardware scarcity.
Amp wants to be the PJM of compute, not another neo-cloud: He describes Amp as an independent system operator for AI, pooling about 1.3 gigawatts of trusted supply and demand across clouds so labs get guaranteed base load plus flexible spike capacity.
Anthropic's edge was preparation, not luck: Midha says the company spent four years building a culture around efficiency, safety, and a P0 on coding, which is why its October 2024-era breakthrough looked sudden from the outside but was earned internally.
Culture is fragile, not a moat you can buy: His blunt diagnosis for cash-rich labs that still cannot ship state-of-the-art systems is that they skipped hardship, never defined the real P0, and let culture fray because mission-aligned actions stopped matching rhetoric.
Community backlash is becoming a real compute bottleneck: Citing a Stanford talk from General Matter founder Scott Nolan, he says data centers should consider charging something like $4.50 instead of $4 per GPU-hour and returning the extra $0.50 directly to local communities.
His most personal AI bet is end-of-life prediction: Drawing on work with Stanford Med professor Nigam Shah and a 12-million-patient dataset, Midha argues that even relatively simple models can improve palliative-care decisions, but regulation still blocks shifting liability from doctors to AI systems.
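
For readers unfamiliar with the two metrics in the first point, here is a minimal sketch of how they are commonly computed. It is not Midha's or Amp's method. It assumes the widely used approximation of about 6 FLOPs per parameter per token for dense transformer training, and every number in the example (model size, throughput, GPU count, peak FLOPs per GPU) is a hypothetical placeholder chosen for illustration.

```python
def node_utilization(busy_node_hours: float, available_node_hours: float) -> float:
    """Fraction of available node-hours that were actually running jobs."""
    if available_node_hours <= 0:
        raise ValueError("available_node_hours must be positive")
    return busy_node_hours / available_node_hours


def mfu(params: float, tokens_per_second: float,
        num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Model FLOPs utilization: achieved training FLOPs/s over theoretical peak.

    Assumption: roughly 6 * params FLOPs per token (forward plus backward)
    for a dense transformer; attention FLOPs are ignored.
    """
    if num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("num_gpus and peak_flops_per_gpu must be positive")
    achieved_flops_per_second = 6.0 * params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical cluster: every value below is illustrative, not from the episode.
util = node_utilization(busy_node_hours=980.0, available_node_hours=1000.0)
run_mfu = mfu(params=70e9, tokens_per_second=1.0e6,
              num_gpus=1024, peak_flops_per_gpu=1.0e15)

print(f"node utilization: {util:.1%}")
print(f"MFU: {run_mfu:.1%}")
```

In this made-up run the cluster clears the 95% utilization bar, yet MFU lands well under the 60 to 70% range Midha cites. The two numbers measure different things: whether nodes are busy, and how much useful math the busy nodes are doing.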

The Breakdown

Unlimited GPUs and giant budgets are not what make AI labs win. Anjney Midha argues that culture, alignment, and ruthless "output maxing" matter more, then connects that thesis to everything from Anthropic's coding breakthrough to a deeply personal mission around AI for end-of-life care.