Google Cloud CEO: Anthropic, TPUs, Mythos, NVIDIA and more
TL;DR
Google’s TPU advantage is as much about planning as silicon — Thomas Kurian says Google spent years locking in energy, real estate, and faster data center deployment methods, while building TPUs for 11-12 years, which is why it can serve Gemini, customers like Anthropic, and external markets like Citadel at the same time.
Owning chips changes the economics in a compute-constrained world — Kurian’s blunt framing is that if demand exceeds supply for the next decade, “it’s better to have your own chips and demand than not having your own chips,” because resellers get squeezed while Google keeps attractive unit economics.
Google sees inference and agents reshaping infrastructure design — the split between 8T for training and 8I for inference reflects a world of long-running agents, bigger KV caches, multimodal output, and geographically distributed low-latency inference rather than just giant centralized training runs.
The next bottleneck isn’t just GPUs — it’s making consumer agents affordable — Kurian says enterprise can pay, but consumer use cases like a travel-booking agent break if VMs and local storage stay expensive, so the challenge is activating compute cheaply across the whole stack.
Google’s AI pitch to the public is practical usefulness, not hype — he points to Signal in Germany cutting some insurance answers from 23 minutes to seconds without layoffs, ASCO helping oncologists navigate treatment rules, and Citi building a Gemini-powered wealth advisor for ordinary users.
On cyber risk, Google’s answer is more AI on defense, not retreat — reacting to concerns like Anthropic’s reported findings on AI-enabled cyberattacks, Kurian argues attackers will use increasingly capable open models anyway, so the winning strategy is continuous red teaming, prioritization, and AI systems that can actually patch vulnerabilities.
The Breakdown
Why Google has so much TPU capacity when everyone else feels starved
Kurian opens with the big flex: Google didn’t stumble into capacity, it planned for it years in advance. He says the company diversified energy sources, locked in real estate, shifted data center deployment from slow construction to manufacturing-style assembly, and kept building its own silicon alongside Nvidia partnerships for over a decade.
TPUs aren’t just internal tools anymore
What makes Google different, in his telling, is that it monetizes the stack in multiple ways: Gemini, third-party inference, and raw TPU access. He points to customers like Citadel and the Department of Energy as proof TPUs are becoming general-purpose infrastructure, not just something for AI labs, and says diversification helps both product quality and Google’s leverage with suppliers.
Why Google doesn’t hoard all the compute for AGI
Berman presses on the obvious question: if compute is destiny, why give rivals access? Kurian’s answer is pragmatic, not mystical — these systems need cash flow, venture capital won’t fund losses forever, and selling capacity is one way to fund the whole machine while still balancing internal needs.
Data centers, public backlash, and the case for AI that actually helps people
On the ugly 20% favorability numbers around data centers, Kurian says the real worries are energy costs and whether local communities benefit. He describes behind-the-meter energy, low PUE (power usage effectiveness, where closer to 1.0 means less energy wasted on overhead), distributed siting, and local investment, then broadens the AI argument with examples like Signal in Germany, where response time fell from 23 minutes to seconds without layoffs, plus oncology support for ASCO and Citi’s AI wealth advisor.
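To make the PUE point concrete, here’s a toy calculation; the numbers are illustrative assumptions, not Google’s actual figures.

```python
# PUE (power usage effectiveness) = total facility power / IT equipment power.
# 1.0 is the theoretical floor; the numbers below are illustrative, not Google's.
it_equipment_kw = 10_000    # servers, TPUs, networking (assumed load)
overhead_kw = 1_000         # cooling, power conversion, lighting (assumed)

pue = (it_equipment_kw + overhead_kw) / it_equipment_kw
print(f"PUE = {pue:.2f}")   # 1.10 here; lower is better
```

The closer to 1.0, the less energy goes to anything other than compute — which is the efficiency story Kurian is gesturing at when the conversation turns to community energy costs.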
Google says AI is boosting hiring, not shrinking Cloud
Asked directly whether AI productivity means fewer people, Kurian says Google Cloud is still hiring in product, sales, go-to-market, and deployed engineering. His most vivid example is cybersecurity: as models get better at finding code vulnerabilities, Google is building not just detection but repair models, plus new Wiz-powered agents for continuous red teaming, prioritization, and patching.
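As a rough mental model of that find–prioritize–fix loop, here’s a hedged sketch; the function names are hypothetical placeholders, not a real Wiz or Google Cloud API.

```python
import time

def continuous_defense_loop(scan, triage, patch, verify, interval_s=3600):
    """Hypothetical sketch of the continuous red-team -> prioritize -> patch
    cycle Kurian describes. All four callables are placeholder assumptions."""
    while True:
        findings = scan()                      # red-team models probe for vulnerabilities
        for vuln in triage(findings):          # rank by exploitability and blast radius
            fix = patch(vuln)                  # a repair model proposes a code change
            if not verify(fix):                # regression-test before shipping
                print(f"escalate to a human: {vuln!r}")
        time.sleep(interval_s)                 # then do it all again, continuously
```

The point of the sketch is the shape: detection alone is table stakes, and the differentiator Kurian emphasizes is the repair step.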
The TPU-vs-Nvidia argument comes down to the whole system
Kurian doesn’t take Jensen Huang’s TCO claim head-on so much as point to market behavior: if TPUs were worse economics, other frontier labs wouldn’t be clamoring for them. He argues the edge comes from the full stack — giant 9,600-chip TPU pods, optical torus networking, up to 2 petabytes of memory on 8T, plus software layers like JAX, XLA, Pathways, and an obsession with “goodput” and tokens per watt.
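For readers who haven’t touched the software layers he names, here’s a minimal JAX sketch of the compile-once, replay-many pattern XLA enables; nothing in it is Gemini- or TPU-specific, and it runs on CPU too.

```python
import jax
import jax.numpy as jnp

@jax.jit  # traces the function once, compiles it with XLA,
def mlp_block(x, w1, w2):  # then reuses the optimized program on every call
    return jax.nn.relu(x @ w1) @ w2

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (8, 512))
w1 = jax.random.normal(key, (512, 2048))
w2 = jax.random.normal(key, (2048, 512))

y = mlp_block(x, w1, w2)  # first call pays compilation; later calls are fast
print(y.shape)            # (8, 512)
```

On TPUs, the same compiled program gets sharded across a pod by the runtime layers he mentions, and “goodput” is then a question of how much of that silicon is doing useful work rather than waiting on stragglers or restarts.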
Why Google split training and inference chips
This is one of the most useful parts of the conversation. Kurian walks through how workloads evolved from search-like Q&A, to media generation, to agents that use tools and computers for 6, 7, even 12 hours — and says that shift changed chip design around output-heavy workloads, KV cache, memory pinning, general compute integration, and air-cooled deployment in more locations.
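To see why output-heavy agent workloads push memory so hard, here’s a back-of-envelope KV cache calculation; every model number in it is an assumption for illustration, not a Gemini or TPU spec.

```python
# KV cache stores one key and one value vector per layer per token,
# so for long agent sessions it grows linearly with context length.
layers = 80                 # transformer layers (assumed)
kv_heads = 8                # KV heads after grouped-query attention (assumed)
head_dim = 128              # dimension per head (assumed)
bytes_per_value = 2         # bf16
context_tokens = 1_000_000  # an hours-long, multimodal agent session (assumed)

per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # K and V
total_gb = context_tokens * per_token / 1e9
print(f"{per_token:,} bytes/token -> {total_gb:,.0f} GB of KV cache")
# 327,680 bytes/token -> 328 GB, for a single session, before model weights
```

A single long session eating hundreds of gigabytes on top of the weights is exactly why inference-oriented chips prioritize memory capacity, cache pinning, and cheap air-cooled deployment in many locations.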
The next bottleneck: affordable agents, Anthropic, Mythos, and safe release thresholds
Kurian says the next real choke point is consumer economics: people can’t afford virtual machines running forever just so an agent can book travel, so storage and activation costs have to drop. He then frames the Anthropic relationship as a normal platform-company tension, repeats the line that it’s better to have your own chips and demand, stays cagey on rumored 10-trillion-parameter “Mythos”-style systems, and closes on cyber safety by arguing that the real problem is attackers will have strong open models too — so the defense has to be AI that finds, prioritizes, and fixes problems continuously.
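Circling back to the consumer-economics point, a toy comparison shows why always-on infrastructure breaks the model; all prices below are made-up placeholders, not Google Cloud rates.

```python
# Always-on VM vs. pay-per-activation for an occasional consumer agent.
# Every price here is an illustrative assumption, not a real cloud rate.
vm_hourly = 0.10                 # small always-on VM, $/hour (assumed)
always_on = vm_hourly * 730      # ~730 hours in a month, mostly idle

sessions_per_month = 20          # a few travel-booking runs (assumed)
seconds_per_session = 120
per_second_activation = 0.0001   # serverless-style billing (assumed)
on_demand = sessions_per_month * seconds_per_session * per_second_activation

print(f"always-on: ${always_on:.2f}/mo vs. on-demand: ${on_demand:.2f}/mo")
# ~$73/mo vs. ~$0.24/mo: the gap Kurian says has to close for consumer agents
```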