Context Is the New Code — Patrick Debois, Tessl
TL;DR
Patrick Debois argues that prompts are becoming code — at Tessl, he’s replacing brittle onboarding logic with reusable “skills” that tell agents how to detect a user’s package manager and ecosystem and walk the user through next steps, instead of hard-coding every branch.
He proposes a “context development life cycle” modeled on DevOps — generate, test, distribute, observe, then adapt context in a loop — because teams now ship behavior through agent.md, docs, tickets, and MCP-fed data, not just source files.
The missing discipline is evals for context — Debois says changing two lines in Claude.md or agent.md without testing is basically YOLO, and suggests linting context, using “Grammarly”-style checks for clarity, and running LLM-as-judge or agentic end-to-end tests.
Context testing is probabilistic, not deterministic — instead of one CI pass/fail, he recommends running evals multiple times, tracking success rates, and thinking in error budgets, because the same test can pass once and fail the next.
Reusable context will create the same packaging problems as software — skills, registries, dependencies, version conflicts, and security scanning are already emerging, and Debois bluntly says “99.9%” of marketplace skills are currently low quality.
The real leverage is organizational feedback loops — PR comments, agent logs, and even production failures should feed back into shared context so one team’s fix becomes everyone’s improved default, like a flywheel for AI coding quality.
The Breakdown
A room split between AI-agent users and skeptics
Debois opens by polling the room on who’s used an AI coding agent, then jokes that the holdouts are “my kind of people.” He frames the talk as intentionally unfinished thinking — not polished doctrine — and uses that looseness to introduce his core claim: he increasingly “barely touches code” and instead tells the AI what to do.
From hard-coded logic to reusable skills
The first concrete example comes from Tessl’s onboarding for AI agents across Python, Node.js, and messy packaging ecosystems. Instead of coding every branch manually, he describes a skill that tells the agent to first figure out the package manager, then the ecosystem, then walk through steps with the user — a context artifact that solved more edge cases than traditional code could.
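To make the contrast concrete, here is a minimal sketch of the branch logic such a skill replaces with a prose instruction like “first figure out the package manager, then the ecosystem, then walk the user through the steps.” The lockfile names are common conventions; the function and table are hypothetical, not Tessl’s code.

```python
from pathlib import Path

# Lockfile -> (package manager, ecosystem). A small slice of the branching
# that grows unbounded in real projects: monorepos, multiple lockfiles,
# half-migrated tooling, and so on.
LOCKFILE_HINTS = {
    "package-lock.json": ("npm", "node"),
    "yarn.lock": ("yarn", "node"),
    "pnpm-lock.yaml": ("pnpm", "node"),
    "poetry.lock": ("poetry", "python"),
    "uv.lock": ("uv", "python"),
    "Pipfile.lock": ("pipenv", "python"),
    "requirements.txt": ("pip", "python"),
}

def detect_toolchain(project_dir: str) -> tuple[str, str] | None:
    """Return (package_manager, ecosystem) for the first lockfile found."""
    root = Path(project_dir)
    for name, hint in LOCKFILE_HINTS.items():
        if (root / name).exists():
            return hint
    return None  # ambiguous: the skill tells the agent to ask the user instead
```

The skill version expresses the same intent in prose, and the agent handles the edge cases (multiple lockfiles, unfamiliar tools) that the table above silently misses.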
The DevOps analogy: a lifecycle for context
Debois ties the idea back to his DevOps roots, saying that in 2009 he asked, “what if ops looked more like dev?” and now he’s asking the same about context. His answer is a DevOps-style infinity loop for context: generate it, test it, distribute it, observe how it behaves, and regenerate or adapt based on what happens.
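Read as pseudocode, one pass through that loop might look like the sketch below, where every function body is an illustrative stub and only the five stage names come from the talk:

```python
def generate(context: str) -> str:
    return context  # author or regenerate instruction files

def test(context: str) -> bool:
    return True  # run evals against the new context before shipping it

def distribute(context: str) -> None:
    print("published context revision")  # push to a team registry or repo

def observe(context: str) -> list[str]:
    # agent logs, PR comments, production failures (stubbed finding)
    return ["agents keep asking which package manager to use"]

def adapt(context: str, findings: list[str]) -> str:
    return context + "".join(f"\n- {f}" for f in findings)

context = "- prefix endpoints with /awesome"
for _ in range(2):  # in practice the loop never terminates; two passes for demo
    draft = generate(context)
    if test(draft):
        distribute(draft)
    context = adapt(draft, observe(draft))
```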
Prompting is just the shallow end of context creation
He starts with the obvious layer — humans typing prompts — then moves into reusable instructions like agent.md and jabs Claude for still calling it Claude.md. From there, context expands to pulled-in library docs, GitHub and Slack data via MCP, tickets, and spec-driven development where agents turn high-level specs into plans and sub-prompts.
Why changing context without evals is “YOLO” engineering
The sharp turn in the talk is his complaint that teams tweak instruction files blindly without knowing the impact. He walks through several eval patterns: lint-like schema validation for skills, a “Grammarly” check that asks whether context is explicit enough for the agent to understand, and LLM-as-judge tests that verify whether generated code follows company conventions like prefixing endpoints with /awesome.
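A hedged sketch of that last pattern, assuming Flask-style route decorators and a hypothetical `call_llm` client: a deterministic lint catches the obvious violations of the /awesome convention, and the judge prompt covers the cases a regex cannot.

```python
import re

# Deterministic "lint" layer: do generated endpoints follow the company
# convention of a /awesome prefix? (Convention from the talk; the route
# pattern assumes Flask-style decorators.)
ROUTE = re.compile(r"""@app\.route\(\s*["'](?P<path>[^"']+)["']""")

def lint_endpoint_prefix(source: str) -> list[str]:
    return [
        m.group("path")
        for m in ROUTE.finditer(source)
        if not m.group("path").startswith("/awesome")
    ]

# Probabilistic "judge" layer: hand the same artifact to an LLM with the
# convention spelled out. `call_llm` is a hypothetical client, not a real API.
JUDGE_PROMPT = """You are reviewing generated code against team conventions.
Convention: all HTTP endpoints must be prefixed with /awesome.
Answer PASS or FAIL, then explain.

Code:
{code}
"""

generated = '@app.route("/users")\ndef users(): ...'
print(lint_endpoint_prefix(generated))  # ['/users'] -> convention violated
# verdict = call_llm(JUDGE_PROMPT.format(code=generated))  # hypothetical call
```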
Agentic testing, CI, and the weirdness of non-determinism
Debois pushes the idea further by giving the judge tools, so it can run code in a sandbox and perform real end-to-end checks, like issuing a curl request, rather than just inspecting files. But he warns that dropping evals into CI/CD isn’t straightforward because results vary from run to run, so teams should run tests multiple times and treat quality more like an error budget than a binary gate.
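A minimal sketch of the error-budget idea, with the eval itself stubbed out as a weighted coin flip and the budget threshold chosen arbitrarily:

```python
import random

def run_eval() -> bool:
    """Stand-in for one agentic eval run; real runs call the agent and judge."""
    return random.random() < 0.9  # pretend the eval passes ~90% of the time

def pass_rate(trials: int = 10) -> float:
    return sum(run_eval() for _ in range(trials)) / trials

# Instead of a binary CI gate, compare the observed rate to an error budget.
BUDGET = 0.8  # hypothetical: tolerate up to 20% flaky failures
rate = pass_rate(trials=10)
print(f"pass rate {rate:.0%} ->", "within budget" if rate >= BUDGET else "over budget")
```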
Skills, registries, dependency hell, and security scanners
Once context becomes reusable, he says, you naturally get package-like distribution: installable skills, registries, marketplaces, and dependencies. He notes that skills can bundle scripts and docs, predicts “dependency hell” for conflicting context packages, and says the rise of tools like Snyk and concerns after OpenClaw show that context now needs scanning, provenance, and something like an AI-flavored SBOM.
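For the dependency-hell prediction, a toy illustration (skill names and version pins invented) of how two context packages can disagree about a shared dependency:

```python
# Two hypothetical skills pin different versions of a shared skill -- the
# kind of conflict Debois predicts for context packages.
SKILLS = {
    "onboarding-skill": {"package-detection": "1.2.0"},
    "release-skill": {"package-detection": "2.0.0"},
}

def find_conflicts(skills: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    pins: dict[str, set[str]] = {}
    for deps in skills.values():
        for dep, version in deps.items():
            pins.setdefault(dep, set()).add(version)
    return {dep: versions for dep, versions in pins.items() if len(versions) > 1}

print(find_conflicts(SKILLS))  # {'package-detection': {'1.2.0', '2.0.0'}}
```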
Logs, PR comments, and production failures as context feedback
In the final stretch, Debois shifts from authoring context to maintaining it at team and company scale. Agent logs reveal where developers repeatedly hit missing guidance, PR comments are feedback on the context that generated the code, and even production failures can be turned into new test cases; his closing metaphor is that LLMs are just the engine, and context is the fuel — if the fuel is bad, the engine won’t save you.
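One turn of that flywheel could be as simple as appending a regression case to a shared eval file, so the next context change is tested against the failure that already happened; the file layout and field names here are hypothetical.

```python
import json
from pathlib import Path

def failure_to_eval(incident: dict, eval_file: str = "context-evals.jsonl") -> None:
    """Turn a production failure report into a regression eval case."""
    case = {
        "name": f"regression: {incident['summary']}",
        "prompt": incident["task"],          # what the agent was asked to do
        "must_not": incident["bad_output"],  # what production taught us to reject
    }
    with Path(eval_file).open("a") as f:
        f.write(json.dumps(case) + "\n")

failure_to_eval({
    "summary": "unprefixed endpoint shipped",
    "task": "add a health-check endpoint",
    "bad_output": '@app.route("/health")',
})
```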