
CI/CD breaks when agents, not humans, become the unit of work — Madison Faulkner argues today’s GitHub Actions + PR workflow assumed humans submitting one or two diffs a week, but agentic systems generate N PRs across N repos, creating thousands of short-lived branches that are effectively impossible to merge cleanly.
The bottleneck shifts from code generation to validation and merge serialization — Hugo Santos says inference is already fast and getting faster, so the real pain becomes builds, tests, review loops, and the single-ledger nature of Git, which starts looking less like software tooling and more like a high-performance database locking problem.
The replacement is “continuous compute,” where validation happens inside the agent loop — instead of code → PR → CI → review, the new flow is intent/spec → agent harness → internal validation → external validation → pre-merge queue, with builds, tests, and policy checks running continuously rather than as a separate CI phase.
Humans stop reviewing raw code and start approving intent versus result — Santos describes a near-term workflow where reviewers look at “this was the goal, this was the output,” like a working feature video or a security-LLM report, because his own team already sees 4x the old PR volume and says it’s impossible for a person to inspect every diff.
Stateful agent environments become critical infrastructure — both speakers stress that agents can’t keep restarting from scratch; warm caches, persistent memory, hardware/software co-design, ingress shaping, rate limiting, and orchestration are what make retries, fast validation, and large-scale agent loops actually viable.
This shift is happening in weeks to months, not years — Santos points to teams at companies like Fal, Zed, and Ramp already operating this way, and predicts an even weirder “multiverse” future where agents work from multiple possible repo states at once because the main branch is moving too fast to be a stable starting point.
Madison Faulkner, now a partner at NEA after leading data and AI teams at Meta, opens with the blunt thesis: agentic software is breaking traditional CI/CD. She frames the shift as moving from monolithic, LLM-as-one-engine systems into a world of microservices with agents, where the old build-test-deploy stack suddenly looks messy and badly matched to how software is now getting made.
She walks through the normal human workflow: a developer submits one or two diffs, teammates review them, GitHub Actions runs build/test/deploy, and failed tests kick off another round. Then she swaps in agents using the exact same systems, except now there are potentially countless PRs and repos in play, and the whole thing gets chaotic fast — thousands of short-lived branches, cold starts, and conflicting versions all pulling the same codebase in different directions.
Madison points to a chart showing GitHub commit activity and lines added versus deleted exploding in recent months. Her proposed first fix is not to rip everything out immediately but to accelerate the slowest part by layering over existing GitHub Actions and CI infrastructure, with the cache becoming the orchestration layer through tight hardware/software co-design.
From there, she outlines a future stack: ingress shaping and rate limiting at intake, a cache layer that routes work to the right infrastructure, and eventually agentic identity plus retries at scale. She reinforces the point with Mitchell Hashimoto’s public ideas about how he’d “fix GitHub,” including evolving it for the cloud era and for inference-heavy workflows, while joking that if platforms don’t serve AI users first, “we die.”
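To make "ingress shaping and rate limiting at intake" concrete, here is a minimal sketch of a token bucket, one common way to rate-limit bursty submissions. The talk does not say which mechanism Faulkner has in mind, so the `TokenBucket` class, its parameters, and the idea of charging one token per agent-submitted job are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Hypothetical intake limiter: one token is spent per admitted job."""
    capacity: float          # maximum burst size
    refill_per_sec: float    # sustained admission rate
    tokens: float = field(init=False)
    last: float = 0.0        # timestamp of the previous call, in seconds

    def __post_init__(self) -> None:
        # Start full so an initial burst up to `capacity` is admitted.
        self.tokens = self.capacity

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should queue or delay the job


bucket = TokenBucket(capacity=2, refill_per_sec=1)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # three jobs arrive at once
later = bucket.allow(now=1.0)                      # one more, a second later
```

Passing `now` explicitly keeps the sketch deterministic. A real intake layer would read a clock and would likely keep one bucket per agent identity or per repository.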
Hugo Santos, CEO of Namespace and former Google microservices lead, picks up the thread by saying this is already how frontier teams work. His key reframing is memorable: up to now, the human was the agent, looping through intent, PR feedback, failing tests, reviewer comments, and merge-queue conflicts — but that loop was tolerable only because human latency hid how slow the machines were.
As code generation gets cheap and continuous, he says the merge moment becomes the real choke point. Git starts to resemble a high-performance database with a single ledger and serialized writes, where every change needs to commit safely. The faster the machines get, the more painful that lock becomes, because the window in which a change can merge shrinks.
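The single-ledger problem can be sketched as a compare-and-swap on the head commit. This is a deliberately simplified model, not how Git or any specific merge queue works: it assumes a change may land only if it was validated against the current head, and that a rejected change must rebase and try again. `Ledger` and `land_all` are hypothetical names.

```python
class Ledger:
    """Toy single-writer history: head is just a counter of landed changes."""

    def __init__(self) -> None:
        self.head = 0

    def try_merge(self, base: int) -> bool:
        # Compare-and-swap: land only if the change was built on the current head.
        if base != self.head:
            return False
        self.head += 1
        return True


def land_all(num_agents: int) -> int:
    """Return the total merge attempts needed to land one change per agent."""
    ledger = Ledger()
    bases = [ledger.head] * num_agents  # every agent starts from the same commit
    pending = list(range(num_agents))
    attempts = 0
    while pending:
        still_pending = []
        for agent in pending:
            attempts += 1
            if not ledger.try_merge(bases[agent]):
                bases[agent] = ledger.head  # rebase, then revalidate and retry
                still_pending.append(agent)
        pending = still_pending
    return attempts


attempts_for_4 = land_all(4)
```

In this model only one change lands per round, so n concurrent changes cost n(n+1)/2 attempts in total. The quadratic growth is an artifact of the strict assumption, but it illustrates why faster generation makes the serialization point hurt more.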
Santos describes the emerging workflow: start with intent and plan written somewhere like Linear or Slack, feed that into an agent harness like Amp, Cursor, Claude Code, or Factory, and let the agent work from a well-known commit. Validation happens inside the loop — build it, test it, ask the human “continue?” — instead of waiting for a separate CI stage after a pull request exists.
His more radical prediction is near-term, not distant: external validation will also be done by agents, such as a security-focused LLM or an API-conformance LLM, and they will need stateful environments because restarting from scratch kills speed. Humans come back only at pre-merge, approving intent versus result across semantically grouped changes. Beyond that, he imagines a “multiverse” where agents try the same plan against multiple candidate repo states at once because the tip of main is moving too fast to be a stable base.
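One speculative reading of the "multiverse" idea, sketched under heavy assumptions: the agent runs the same plan against several guesses at where main will be, then keeps whichever result matches the actual head at merge time. Santos does not describe a mechanism, so `speculate`, `pick`, and the commit labels below are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


def speculate(
    plan: str,
    candidate_bases: Iterable[str],
    apply_plan: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run the same plan once per candidate base commit."""
    return {base: apply_plan(plan, base) for base in candidate_bases}


def pick(results: Dict[str, str], actual_head: str) -> Optional[str]:
    """Keep the result built on the real head, or None if every guess missed."""
    return results.get(actual_head)


# Stub plan application, for illustration only.
results = speculate("add-flag", ["c1", "c2", "c3"], lambda plan, base: f"{plan}@{base}")
chosen = pick(results, actual_head="c2")
```

The trade is extra compute for a better chance that one result is already valid against the head when the merge slot opens.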