AI Engineer · 18m

Combine Skills and MCP to Close the Context Gap — Pedro Rodrigues, Supabase

TL;DR

  • Supabase found MCP alone isn’t enough for safe behavior — in Pedro Rodrigues’ SQL view example, Claude with only the MCP server missed Postgres’ security_invoker = true requirement and created a view that could bypass row-level security, while Claude with the Supabase skill got it right.

  • The hard part isn’t context, it’s guidance — Pedro’s core claim is that agents are already capable, but on fast-changing products like Supabase they need explicit instructions for fresh docs, security pitfalls, and preferred workflows rather than just tool access.

  • If critical information can be skipped, the model will skip it — Supabase learned that agents are reluctant to load reference files or multiple documents, so anything truly non-negotiable, like a security checklist, has to live directly in skill.md.

  • Good product skills should be opinionated, not generic — Supabase explicitly tells agents how to manage schema changes: do direct DDL in dev or staging, run the advisor for security/performance issues, fix them, and only then generate a migration.

  • They actually eval’d the markdown itself — Supabase tested six scenarios across four agents from Anthropic and OpenAI in three conditions (baseline, MCP only, MCP plus skill) and says MCP plus skill won on every model using Braintrust test-completeness scoring.

  • Skill distribution is still messy and unresolved — Pedro says the ecosystem still lacks a standard registry or package flow, with approaches like Vercel’s skills package, model-specific plugins, and repo-level .claude or .cursor folders all competing.

The Breakdown

From “MCP vs Skills” to “How We Actually Wrote One”

Pedro opens by saying the old MCP-versus-skills debate has mostly cooled off: they do different jobs. So instead of arguing abstractions, he focuses on the very practical pain of writing Supabase’s own skill — a task he jokes took more effort than his master’s thesis.

Why Agents Still Need Help on Real Products

His framing is simple: agents are already smart enough to do routine work, but they fall apart when product knowledge is new, updated, or full of sharp edges. With Supabase, that meant stale training data, missed security requirements like row-level security, and a stubborn tendency to act confident instead of admitting they need fresh documentation.

What a Skill Actually Is — and the Security Demo That Made the Case

Pedro gives the quick primer: a skill is a folder containing a skill.md file with front matter, plus optional resources that agents discover progressively. Then he shows the concrete Supabase test. Claude Sonnet 4.6 was asked to create a SQL view on top of a table with RLS enabled. Without the skill, it failed to set Postgres’ security_invoker = true option, so the view ran with its owner’s privileges and could expose rows that RLS should have hidden. With the skill, it set the option and the view respected RLS.
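To make the demo concrete, here is a minimal Python sketch holding the two versions of the view as SQL strings, plus a naive textual check for the option. The table and view names (orders, recent_orders) and the helper view_sets_security_invoker are hypothetical and not taken from the talk; the WITH (security_invoker = true) clause itself is standard Postgres syntax from version 15 onward.

```python
import re

# Hypothetical example: `orders` is assumed to be a table with RLS enabled.
# Without security_invoker, the view is evaluated with its owner's privileges.
UNSAFE_VIEW = """
CREATE VIEW recent_orders AS
SELECT id, customer_id, total FROM orders;
"""

# Same view, but permission and RLS checks run as the querying user.
SAFE_VIEW = """
CREATE VIEW recent_orders
WITH (security_invoker = true) AS
SELECT id, customer_id, total FROM orders;
"""

_INVOKER = re.compile(r"security_invoker\s*=\s*(true|on)\b", re.IGNORECASE)


def view_sets_security_invoker(sql: str) -> bool:
    """Naive textual check: does the statement set security_invoker to true/on?"""
    return bool(_INVOKER.search(sql))
```

A check like this is only a lint over text, not a parser, so it would be a backstop for review rather than a replacement for the guidance in the skill.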

Principle 1: Don’t Duplicate Docs, Point to the Source of Truth

One big lesson: don’t turn the skill into a second documentation set that will immediately drift. Instead, the skill should aggressively tell the agent where to find the freshest docs and push it to actually go there — which is why Supabase is also experimenting with exposing documentation over SSH so agents can browse it like a remote filesystem they already understand.

Principle 2: If It Can Be Skipped, It Will Be Skipped

This was one of the most practical parts of the talk. Pedro says agents avoid expensive actions like web fetches and even ignore bundled reference files, especially if solving the task requires opening two or three of them; Supabase originally put its security checklist in a reference file, watched models miss it, and moved it into skill.md so it becomes unavoidable.
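One way to enforce this principle is to lint the skill file itself. The sketch below is an assumption about how such a check might look, not something shown in the talk: the skill body, its checklist items, the references path, and the helper missing_inline_sections are all hypothetical.

```python
# Hypothetical skill.md body for illustration; not Supabase's actual skill.
SKILL_MD = """\
---
name: example-product
description: How to work safely with Example Product.
---

## Security checklist
- Enable row-level security on every exposed table.
- Create views with security_invoker = true.

## Further reading
See references/schema-workflow.md for details.
"""


def missing_inline_sections(skill_text: str, required_headings: list[str]) -> list[str]:
    """Return the required headings that do not appear as heading lines in skill.md."""
    headings = {
        line.lstrip("#").strip().lower()
        for line in skill_text.splitlines()
        if line.startswith("#")
    }
    return [h for h in required_headings if h.lower() not in headings]
```

Calling missing_inline_sections(SKILL_MD, ["Security checklist"]) returns an empty list here, because the checklist lives in the main file rather than only in a reference file the agent might never open.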

Principle 3: Be Opinionated About How Your Product Should Be Used

Pedro’s message here is almost parental: you know your product best, so tell the model how to use it. For Supabase schema work, that means allowing direct DDL changes in dev or staging, using the built-in advisor to catch performance and security issues, fixing those, and only then creating a migration file instead of generating migrations on every tiny schema edit.
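The ordering in that workflow can be sketched as a small decision function. This is an illustrative model only: next_step, its parameters, and the returned strings are hypothetical, and it does not call any real Supabase tool or advisor API.

```python
def next_step(environment: str, advisor_ran: bool, open_findings: list[str]) -> str:
    """Return the next action in the schema-change loop described above."""
    # Direct DDL is only acceptable outside production.
    if environment not in ("dev", "staging"):
        return "switch to dev or staging before running direct DDL"
    # The advisor must run before a migration is even considered.
    if not advisor_ran:
        return "run the advisor"
    # Security and performance findings block the migration until fixed.
    if open_findings:
        return "fix advisor findings, then rerun the advisor"
    return "generate the migration"
```

For example, next_step("dev", True, []) returns "generate the migration", while any open finding sends the agent back to fixing before a migration file is written.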

They Tested the Document Like Software — and the Skill Won

Supabase ran evals on six scenarios using four agents from Anthropic and OpenAI across three conditions: no MCP/no skill, MCP only, and MCP plus skill. Using Braintrust and a test-completeness score, Pedro says the MCP-plus-skill setup outperformed every other condition on Claude Opus 4.6, Sonnet 4.6, GPT-5.4, and GPT-5.4 mini, which led to his closing line: the issue isn’t context, it’s guidance.
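The shape of that eval matrix can be sketched in a few lines of Python. The scenario names are placeholders because this digest does not list them, the scores dictionary is an assumed input rather than real data, and no Braintrust API is used here; scoring itself is out of scope for the sketch.

```python
from itertools import product
from statistics import mean

# Placeholder scenario names; the digest only says there were six.
SCENARIOS = [f"scenario_{i}" for i in range(1, 7)]
AGENTS = ["Claude Opus 4.6", "Claude Sonnet 4.6", "GPT-5.4", "GPT-5.4 mini"]
CONDITIONS = ["baseline", "mcp_only", "mcp_plus_skill"]


def eval_grid() -> list[tuple[str, str, str]]:
    """Every (scenario, agent, condition) run: 6 x 4 x 3 = 72 in total."""
    return list(product(SCENARIOS, AGENTS, CONDITIONS))


def best_condition_per_agent(
    scores: dict[tuple[str, str, str], float],
) -> dict[str, str]:
    """For each agent, pick the condition with the highest mean score across scenarios."""
    best = {}
    for agent in AGENTS:
        means = {
            cond: mean(scores[(scen, agent, cond)] for scen in SCENARIOS)
            for cond in CONDITIONS
        }
        best[agent] = max(means, key=means.get)
    return best
```

Given one score per run, best_condition_per_agent reports the winning condition for each model, which is the comparison behind the claim that MCP plus skill came out ahead on all four.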

Launch, Open Questions, and the Q&A Reality Check

He ends by announcing the skill and blog post live on stage, then the Q&A surfaces two unresolved edges. One is opportunity — vector search and embeddings could make things like SSH-browsed docs much smarter — and the other is infrastructure: skill distribution still has no clean standard, so today teams are cobbling together Vercel packages, model-specific plugins, and repo-local .claude or .cursor directories.
