AI Engineer · May 6, 2026 · 1h 21m

Skills at Scale — Nick Nisi and Zack Proser, WorkOS

TL;DR

Skills are the portable middle ground between bloated memory files and one-off prompts — Nick Nisi and Zack Proser frame skills as small, composable units of work that can be as little as 30 lines of Markdown, but still travel across repos, teams, Claude, Cursor, Codex, and desktop apps.
The description field does the real routing work — they stress that a skill’s YAML front matter, especially the description, is for the model rather than humans, because it determines when Claude should automatically load the skill for prompts like “roast this repo.”
Scripts turn fuzzy agent behavior into deterministic inputs — using Claude’s command interpolation, they show how a skill can run exact git commands, stale TODO checks, or commit summaries so the model starts from fixed evidence instead of “speculating” about what you meant.
Over-prescription can make a skill worse, not better — Zack and Nick repeatedly argue for constraints over novels of instructions, and Nick says one over-engineered Next.js installer skill actually caused about a 30% drop in performance because Claude was already good at Next.js without the extra dogma.
Confidence scoring and progressive disclosure are their two big performance tricks — they demo skills that load extra rubrics only when needed and use internal confidence thresholds to keep asking clarifying questions until the model reaches roughly 95% certainty before acting.
The real promise is broader than coding — examples range from WorkOS auth install flows and repo analysis to recruiting reports in Claude Desktop, Slack-to-Linear task capture, image/video generation with Nano Banana and Veo, and Remotion-based demo videos built from a single prompt.

Summary

“I think I did a cd recently”: coding with agents as the default

Nick Nisi and Zack Proser open with the now-familiar confession that they barely write code solo anymore: Zack jokes his last manual command was probably cd, and Nick says the same. That sets the tone for the whole workshop: if every LLM conversation starts from zero and “Claude never remembers that it ever talked to you,” then the real problem is how to stop repeating yourself all day.

Why memory files help — and where they fall apart

They walk through CLAUDE.md, AGENTS.md, and repo-specific memory files as today’s workaround: useful for saying “we use pnpm here,” but always loaded, often ignored, and not portable. Nick captures the failure mode with a line that gets a laugh: sometimes Claude skips a required step and basically says, “Yeah, you told me to do it. I didn’t feel like it,” followed by the punchline, “that’s how you know it’s a real engineer.”

Skills as DRY for the agent era

Their core pitch is that a skill is a discrete, shareable unit of work: maybe a single Markdown file, maybe a whole folder with scripts, images, and references. Zack’s repo-roast example makes it concrete: a generic agent gives generic repo feedback, but 30 lines of project-specific Markdown can suddenly produce sharp feedback on routing conventions, semantic commits, and README drift.

Anatomy of a skill: front matter, routing, and constraints

They break down SKILL.md: front matter with name and especially description, which Claude uses for routing at runtime. Both emphasize a subtle design rule: don’t write a novel; write constraints. Instead of prescribing every step, tell the model what must never happen or must always hold, like “never be vague” or “every claim must cite code lines and git evidence,” and let it reason from there.
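To make the anatomy concrete, here is a minimal Python sketch that splits a skill file into its front matter and body and checks that name and description are present. The skill text is a hypothetical example written for illustration, not the workshop's actual Repo Roast skill, and the parser only handles flat `key: value` lines rather than full YAML.

```python
# Hypothetical skill file: the name, description, and constraints below are
# illustrative assumptions, not the skill shown in the workshop.
SKILL_TEXT = """---
name: repo-roast
description: Roast a repository. Use when the user asks to roast, critique, or review this repo.
---
Constraints:
- Never be vague.
- Every claim must cite code lines and git evidence.
"""


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (front matter fields, Markdown body).

    Only flat `key: value` lines are supported, a simplification of YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' fence")
    try:
        end = lines.index("---", 1)  # closing fence of the front matter
    except ValueError:
        raise ValueError("front matter is missing its closing '---'") from None
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key: value line: {line!r}")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def check_skill(text: str) -> list[str]:
    """Return a list of problems; an empty list means the basics are present."""
    fields, body = split_front_matter(text)
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing front matter field: {required}")
    if not body:
        problems.append("skill body is empty")
    return problems


if __name__ == "__main__":
    fields, _ = split_front_matter(SKILL_TEXT)
    print(fields["description"])
    print(check_skill(SKILL_TEXT))
```

Note how short the body is: two constraints and no step-by-step procedure, in line with the "constraints, not novels" rule. The description carries the trigger phrases because that is the text the model reads when deciding whether to load the skill.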

Building “repo roast” live, plus a surprisingly deep Q&A

The workshop project is a playful but useful skill called Repo Roast, and attendees are invited to customize it and upload results with a share.sh script. The Q&A gets especially practical: one attendee asks where rules stop and skills begin, and Nick’s answer is basically context economics — keep global memory tiny, push task-specific behavior into skills, then come back later and ask Claude to analyze your week and suggest which skills to split out.

Shared skills, governance headaches, and evals

A longer audience question surfaces the real enterprise pain: what happens when 60 engineers all write their own skills, fork each other’s, and flood a shared library with near-duplicates? Nick and Zack don’t pretend they’ve solved it, but point to WorkOS’s public skills repo, internal marketplaces, versioned plugins, and evals that compare Claude with and without a skill, failing if the skill makes performance worse.
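The "with and without a skill" eval can be sketched as a simple gate. This is an illustrative Python outline under stated assumptions, not WorkOS's actual eval harness: the `Agent` callable, the case format, and the `tolerance` parameter are all hypothetical, and a real run would call the model instead of the stub functions.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: an agent is any callable mapping a prompt to an output string,
# and a case pairs a prompt with a pass/fail check on that output.
Agent = Callable[[str], str]
Case = tuple[str, Callable[[str], bool]]


@dataclass
class EvalResult:
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0


def run_eval(agent: Agent, cases: list[Case]) -> EvalResult:
    """Run every case through the agent and count how many checks pass."""
    passed = sum(1 for prompt, check in cases if check(agent(prompt)))
    return EvalResult(passed, len(cases))


def skill_may_ship(baseline: Agent, with_skill: Agent,
                   cases: list[Case], tolerance: float = 0.0) -> bool:
    """Fail the gate if the skill lowers the pass rate beyond the tolerance."""
    base = run_eval(baseline, cases)
    skilled = run_eval(with_skill, cases)
    return skilled.pass_rate >= base.pass_rate - tolerance


# Stub agents standing in for "Claude without the skill" and "Claude with it".
def baseline_agent(prompt: str) -> str:
    return "run pnpm install"


def over_prescribed_agent(prompt: str) -> str:
    return "run npm install"  # the skill's extra dogma made the answer worse


CASES: list[Case] = [("install dependencies", lambda out: "pnpm" in out)]

if __name__ == "__main__":
    print(skill_may_ship(baseline_agent, over_prescribed_agent, CASES))
```

Running the same cases against both configurations is what catches the over-prescription failure described earlier: a skill that looks thorough but scores below the unassisted baseline is rejected.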

Making skills smarter with scripts, progressive disclosure, and confidence

The second half gets more tactical. They show how script interpolation can fetch deterministic data like the last 10 commits, stale TODOs, or hotspots, saving tokens and preventing tool-call wandering; then they add progressive disclosure so a skill loads a scoring rubric or testing guide only when needed. Nick’s ideation skill demo is the best example: it keeps asking clarifying questions until its confidence score hits 95%+, then writes a contract and phased plan instead of pretending it understands too early.
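The confidence-threshold behavior can be modeled as a loop. In a real skill this logic would be written as instructions for the model rather than as Python, so treat the following as a sketch of the control flow only: the three callables, the 0.95 default, and the `max_rounds` safety cap are assumptions introduced for illustration.

```python
from typing import Callable


def clarify_until_confident(
    estimate_confidence: Callable[[list[str]], float],
    next_question: Callable[[list[str]], str],
    ask_user: Callable[[str], str],
    threshold: float = 0.95,
    max_rounds: int = 10,
) -> tuple[list[str], float]:
    """Keep asking clarifying questions until confidence reaches the threshold.

    Returns the collected answers and the final confidence. max_rounds is a
    hypothetical cap so the loop cannot run forever.
    """
    answers: list[str] = []
    confidence = estimate_confidence(answers)
    rounds = 0
    while confidence < threshold and rounds < max_rounds:
        question = next_question(answers)
        answers.append(ask_user(question))
        confidence = estimate_confidence(answers)
        rounds += 1
    return answers, confidence


# Stubs: confidence rises with each answer, and the user always answers.
SCORES = [0.40, 0.70, 0.90, 0.97]


def stub_confidence(answers: list[str]) -> float:
    return SCORES[min(len(answers), len(SCORES) - 1)]


def stub_question(answers: list[str]) -> str:
    return f"Clarifying question {len(answers) + 1}?"


def stub_user(question: str) -> str:
    return f"answer to: {question}"


if __name__ == "__main__":
    answers, confidence = clarify_until_confident(
        stub_confidence, stub_question, stub_user
    )
    print(len(answers), confidence)
```

Only after the loop exits would the skill move on to writing the contract and phased plan, which is the point of the design: acting is deferred until the stated certainty is reached.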

Beyond code: recruiting, Slack loops, image generation, and animated demos

They end by widening the frame. Skills can power recruiting reports via Claude Desktop connectors, WorkOS’s npx workos install flow through the Claude Agent SDK, and creative workflows like generating images with Nano Banana, animating them with Veo, or building polished product videos with Remotion. The most practical takeaway from the finale: save the messy context, especially the frustrating failures, because next week that transcript becomes raw material for the next, better skill.
