Theo - t3.gg · May 20, 2026 · 23m

I'm scared to make this video

TL;DR

Theo says Google’s new Gemini 3.5 Flash looks great on benchmarks but is terrible to actually use — he praises the headline scores like near-300 tokens/sec and strong Terminal Bench results, then argues the model is misleadingly expensive at $1.50/million input and $9/million output tokens while burning far too many tokens in practice.
The pricing story is the real gotcha: Theo compares Gemini 3.5 Flash to Gemini 3 Flash ($0.50 in / $3 out) and old Gemini 2.0 Flash ($0.10 in / $0.40 out), saying Google has effectively pushed users to models that are up to 20x+ more expensive (by those prices, 15x on input and 22.5x on output versus 2.0 Flash) while keeping price out of the launch materials.
His hands-on coding test was a disaster — in a benchmark where models rebuild his game “Fish Slop,” Gemini 3.5 Flash was the only one that failed, producing broken code, bad assets, and mechanics that didn’t work, while GPT-5.5 handled the task so well he asked it to make the game 3D.
Google replaced a promising open-source CLI with a buggy closed-source one — Theo says Gemini CLI had 100K+ GitHub stars, 6,000 merged PRs, and real community momentum, then Google folded it into the new “Anti-Gravity CLI,” which he found crashy, awkward, and missing basic polish like reliable exit behavior.
The Railway outage is Theo’s proof that Google Cloud itself is untrustworthy — he points to Railway allegedly being blocked by Google Cloud despite spending $2M+ per month, and connects it to prior incidents like Google accidentally deleting Australian pension fund UniSuper’s cloud subscription.
This is also a people-and-politics story inside Google — Theo goes out of his way to praise Dmitri, Jack, and Gal for building trust around Gemini CLI, then says their work was sidelined after Google brought in Windsurf founders for Anti-Gravity, turning a community-driven effort into what he sees as corporate “slop.”

Summary

Theo opens with real fear, not clickbait

Theo starts by saying he’s genuinely scared to publish this because the last time he harshly criticized a Google product, the video got demonetized, suppressed, and manually flagged as “enabling dishonest behavior.” He frames the whole video as a career risk, which gives the rant a different weight: this isn’t just content, it’s him deciding the issue is serious enough to burn goodwill and possibly opportunities.

Gemini 3.5 Flash wins the chart battle

He’s not denying the benchmark story: by the numbers, Gemini 3.5 Flash looks like the best model Google has shipped. He calls out strong performance on Terminal Bench, SWE-Bench Pro, Toolathlon, BrowseComp Agent, and MMMU Pro, and says Google clearly optimized it for agentic work rather than raw knowledge, with a few exceptions such as SkateBench, where it underperformed Gemini 3.1 Pro.

Then he gets to the part Google didn’t emphasize: cost

Theo says Google’s launch materials conspicuously avoid putting dollar signs next to the performance charts, and he thinks that’s because the economics got much worse. He pegs Gemini 3.5 Flash at $1.50 per million input tokens and $9 per million output tokens, versus Gemini 3 Flash at $0.50 in and $3 out, and old Gemini 2.0 Flash at $0.10 in and $0.40 out — then argues that reasoning-token bloat makes the real cost even uglier.
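The price gap is easier to see as arithmetic. Here is a minimal Python sketch using the per-million-token prices quoted above. The workload size (200,000 input tokens and 50,000 output tokens) and the helper names are assumptions for illustration, not figures from the video.

```python
# USD per million tokens (input, output), as quoted in the summary above.
PRICES = {
    "Gemini 2.0 Flash": (0.10, 0.40),
    "Gemini 3 Flash": (0.50, 3.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one workload at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def price_multiple(newer: str, older: str) -> tuple[float, float]:
    """How many times more the newer model charges for (input, output)."""
    new_in, new_out = PRICES[newer]
    old_in, old_out = PRICES[older]
    return new_in / old_in, new_out / old_out


# Hypothetical workload, chosen only to make the comparison concrete.
INPUT_TOKENS, OUTPUT_TOKENS = 200_000, 50_000

for name in PRICES:
    print(f"{name}: ${request_cost(name, INPUT_TOKENS, OUTPUT_TOKENS):.2f}")
# Gemini 2.0 Flash: $0.04
# Gemini 3 Flash: $0.25
# Gemini 3.5 Flash: $0.75

in_x, out_x = price_multiple("Gemini 3.5 Flash", "Gemini 2.0 Flash")
print(f"vs 2.0 Flash: {in_x:.1f}x input, {out_x:.1f}x output")
# vs 2.0 Flash: 15.0x input, 22.5x output
```

This holds the token counts fixed across models. Theo's argument is that 3.5 Flash also emits more tokens per task, so the real gap would be wider than the list prices suggest.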

Fast isn’t helpful if the model wastes tokens and still fails

His core complaint is that Google is selling “speed” while ignoring token efficiency, which he says OpenAI is taking much more seriously. He points to Artificial Analysis output-token comparisons and says Gemini 3.5 Flash ends up among the most expensive modern benchmarked models because it generates so much unnecessary text, so even if tokens stream fast, tasks don’t actually finish faster.
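The point about speed versus token efficiency can be sketched in a few lines of Python. Only the 300 tokens/sec rate and the $9 per million output price come from the summary. The token counts and the second model's speed and price are hypothetical, picked to show the effect rather than measured.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    output_tokens: int             # tokens generated to finish the task, reasoning included
    tokens_per_second: float       # streaming speed
    usd_per_million_output: float  # output price

    def seconds(self) -> float:
        """Wall-clock time spent generating output for the task."""
        return self.output_tokens / self.tokens_per_second

    def cost(self) -> float:
        """Output-token cost only; input tokens are ignored in this sketch."""
        return self.output_tokens * self.usd_per_million_output / 1_000_000


# Hypothetical: a fast but verbose model versus a slower, terser one.
verbose = Run("fast-but-verbose", output_tokens=60_000,
              tokens_per_second=300, usd_per_million_output=9.0)
terse = Run("slow-but-terse", output_tokens=15_000,
            tokens_per_second=100, usd_per_million_output=12.0)

for run in (verbose, terse):
    print(f"{run.name}: {run.seconds():.0f}s, ${run.cost():.2f}")
# fast-but-verbose: 200s, $0.54
# slow-but-terse: 150s, $0.18
```

With these made-up numbers, the model that streams three times faster still takes longer and costs three times as much, because it needs four times the tokens to finish the same task.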

The Fish Slop test is where his patience snaps

Theo uses a practical coding benchmark: asking models to rebuild his old game, Fish Slop, from the original source into a cleaner architecture. Gemini 3.5 Flash was, in his words, the only model that outright failed — broken code, bad glow effects, oversized fish, busted feeding and aging systems, and images so sloppy some didn’t even have transparency; meanwhile GPT-5.5 did so well he escalated the ask and had it turn the game 3D.

Anti-Gravity CLI feels like a downgrade in every way that matters

He then demos the new Anti-Gravity CLI and runs into buggy scrolling, weird input behavior, frozen generation states, awkward sign-in, and even the inability to cleanly quit without typing /exit. What really bothers him is the strategic move behind it: Google is sunsetting support paths for Gemini CLI and Gemini Code Assist under Pro and Ultra plans, while replacing an open-source tool with a closed-source CLI he says is plainly worse.

Gemini CLI had community trust — and Google threw it away

Theo lingers here because this is personal: he says Gemini CLI wasn’t perfect, but it was a real open-source project with 100K+ GitHub stars, thousands of merged PRs, and useful patterns for skills and workflows. He gives unusual praise to three Google employees — Dmitri, Jack, and Gal — for taking feedback seriously, earning trust privately, and delaying this exact video for over a year because they made him believe Google might actually get it right.

Railway going down turns the rant into a broader indictment of Google Cloud

The final act is Railway: Theo says Google Cloud blocked Railway’s account, taking its web-facing layer offline despite Railway allegedly spending more than $2 million a month. He ties that to Google’s history, including the UniSuper incident where an Australian pension fund’s cloud subscription was accidentally deleted, and lands on a bleak thesis: Google has talent, infra, TPUs, and research, but internal politics, churn, and bad incentives keep turning all of that into products he simply doesn’t trust.
