
Open-source AI is a control issue, not a culture-war issue: 0xSero says he has no beef with OpenAI, Anthropic, or xAI, but if your livelihood depends on tools you can't inspect, customize, or keep running when pricing changes, you're exposed.
He thinks GPT-5.4 currently crushes open models on research work: although he likes Kimi, MiniMax, GLM, and Claude for specific tasks, he says GPT-5.4 is in a different league for arXiv-driven investigation, experiment design, and long-horizon backend work.
The leap from 2022 models to 2025 models is the real shock: he frames it as going, in roughly four years, from systems that struggled with 'how many Rs are in strawberry' to Gemini Deep Think reaching gold-medal-level performance at the International Math Olympiad.
AI is already automating more work than most people realize: beyond coding copilots, he points to customer support, tagging, synthetic data, browser CRUD work, budgeting, billing, PR review, and repo analysis as tasks he now hands to agents routinely.
The economics look unsustainable at today's subsidized prices: he says his own GPT Pro usage hit 2 billion tokens in 24 hours, which he estimates could tie up roughly 8 to 16 high-end GPUs, while he pays only $200/month. He argues that cannot last indefinitely.
His core warning is that automation will hit society before society has a plan: he highlights the 25% of Americans who, by his count, rely in some capacity on driving work, the spread of self-driving systems, AI-generated content allegedly making up 57% of the web, and the danger of letting a few closed labs mediate an essential workforce tool.
0xSero opens by grounding the whole video in personal stakes: eight years in tech, two in AI, five in open source, plus work with meetups and organizations, including Anthropic-sponsored education. His thesis is simple and very human: if your life is online and the tools you depend on are opaque, your livelihood is fragile.
Before touching the article, he lays out his priors: LLMs aren't conscious, they're inaccurate statistical machines, and open-source labs are doing solid work even if many models still skew toward coding and agent tasks. Then he gets specific: GPT-5.4 has "blown everything out the water" for research, while Claude is useful for making things work and Kimi, MiniMax, and GLM each have strengths but don't match GPT's research planning and experiment proposal quality.
A big chunk of the video is him showing his actual setup inside Codex: project folders, nested sessions, subagents, and a workflow built to juggle four to eight projects a day without losing the thread. He describes this as the practical reason he's productive — not genius research instincts, but learning how to use these tools well enough that complex work becomes manageable.
He walks through a mixture-of-experts model using Kimi K2.5 as the example: by his figures, roughly 1.1 trillion parameters in total but only about 22 billion active at inference, which is why these giant systems can still be fast. The live experiment is to train a better second router on his own AI-session data, so the system can predict which experts to activate and fit more intelligence into less compute without sacrificing quality.
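To make the "only a fraction of the parameters is active" idea concrete, here is a minimal sketch of generic top-k expert routing. It is not 0xSero's second-router experiment or Kimi's actual implementation. The toy experts, the logits, and the names `top_k_route` and `moe_forward` are all hypothetical, chosen only for illustration.

```python
import math

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over only the selected scores, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, router_logits, k):
    """Blend the outputs of the k routed experts; the other experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Toy setup (assumption): 8 "experts", each a scalar multiplier standing in
# for a full feed-forward block.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
router_logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

routes = top_k_route(router_logits, k=2)
output = moe_forward(1.0, experts, router_logits, k=2)
active_fraction = 2 / len(experts)  # share of experts that actually computed

print(routes, output, active_fraction)
```

In this toy, only two of the eight experts run for a given input. In a real mixture-of-experts model, the router's logits come from a learned layer that scores each token. A better router, in the sense he describes, would be one that picks experts more accurately for the same compute budget.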
Switching back to the article, he zooms out: AI adoption in companies has exploded, and the capability jump since 2022 feels absurd even to people using the tools every day. His version of the story is memorable — four years ago these systems struggled with toy failures, and now Gemini Deep Think is posting gold-medal-level International Math Olympiad performance and models can do research, code, and complex reasoning at once.
He cites a figure that over half the internet (57% by his telling) is now AI-generated, and says the real number may be higher because plenty of people post AI output under their own names. The sticky part isn't the stat but the social consequence: videos of public figures dying or surviving, fake ragebait about races or nations, and a huge share of the population, especially older or less technical users, simply unable to tell what's real anymore.
From there, he moves into labor and power: 25% of Americans, he says, depend in some capacity on driving for their living, which makes transport automation a community-level economic shock, not just an individual one. He also points to governments using the same kinds of models developers use for coding to make life-and-death decisions, then adds a hardware angle: Nvidia's Vera Rubin stack and Groq LPUs could create a massive U.S. advantage, he argues, because a rack costs around $6 million and the newest chips are export-restricted.
His closing argument is that we're in a "golden age of compute" where subscriptions feel cheap and intelligence feels abundant, but the math underneath doesn't hold forever. He shows his own dashboards — 42 billion tokens over 30 days, 2 billion in one day — and says if AI becomes the default layer for research, billing, browser workflows, approvals, and agency-style operational work, then open source has to win because society cannot afford to have a foundational workforce tool controlled by a few closed labs that can raise prices, shut features off, or lock everyone in.
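The scale of those dashboard numbers is easier to judge as rates. The sketch below is back-of-the-envelope arithmetic on the figures quoted above. It assumes, hypothetically, that all 42 billion tokens were covered by a single $200/month subscription, which the video summary does not state outright.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def tokens_per_second(tokens, days):
    """Average sustained throughput implied by a token count over a period."""
    return tokens / (days * SECONDS_PER_DAY)

# Figures quoted in the article.
peak_day_tokens = 2_000_000_000      # 2 billion tokens in one day
month_tokens = 42_000_000_000        # 42 billion tokens over 30 days
subscription_usd = 200               # monthly price he cites

peak_rate = tokens_per_second(peak_day_tokens, 1)
month_rate = tokens_per_second(month_tokens, 30)

# Assumption: the whole month's usage ran under the one subscription.
usd_per_million_tokens = subscription_usd / (month_tokens / 1_000_000)

print(f"peak day:   {peak_rate:,.0f} tokens/s sustained")
print(f"30-day avg: {month_rate:,.0f} tokens/s sustained")
print(f"effective price: ${usd_per_million_tokens:.4f} per million tokens")
```

Under that assumption, the effective price comes out at a fraction of a cent per million tokens, which is the gap between usage and price that his "golden age of compute" argument rests on.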