AI Engineer · 19m

Agents on the Canvas in tldraw — Steve Ruiz, tldraw

TL;DR

  • tldraw has quietly become canvas infrastructure for AI products — Steve Ruiz positions tldraw not just as a whiteboard, but as the SDK underneath products like Replit’s agent canvas, Luba AI’s canvas, and parts of Google Stitch’s annotate mode.

  • Make Real was an early glimpse of ‘vibe coding’ before the term existed — back in 2023, tldraw let nontechnical users sketch a UI and turn it into a working HTML prototype, which Steve frames as one of the first breakout projects showing AI could turn drawings into software.

  • Structured canvas output beats pure image generation for many diagram tasks — instead of making pixels, the model emits the same circles, shapes, and text objects a human would, which Steve says avoids messy ambiguities like conflicting notions of Y-axes, left/right, and spatial semantics in vision data.

  • The big shift was moving from one-shot generation to visible agents collaborating on the canvas — after sidebar-style agent loops felt like “handing my keyboard to some other AI,” tldraw introduced on-canvas fairies that show state, act in parallel, coordinate with a leader, and literally work where the user is looking.

  • Multi-agent orchestration became tangible when the agents got bodies — with fairies.tldraw.com, one agent can scout the canvas, elect a leader, create a to-do list, and delegate work to others, making shared state, overlap avoidance, and progress visible instead of hidden in terminals or sidebars.

  • Giving agents real power quickly runs into ‘sharp tools’ safety tradeoffs — Steve describes wrapping tldraw in an Electron desktop app, exposing a local endpoint that executes posted JavaScript, and discovering models are very willing to modify local apps and minified bundles if you let them.

The Breakdown

tldraw as the hidden canvas layer behind AI apps

Steve opens by reminding the room that tldraw is three things at once: a free online whiteboard, a London startup, and an SDK for building other products. He casually name-drops Replit’s agent canvas, Luba AI, and Google Stitch’s annotate mode to make the point: even if you haven’t used tldraw directly, you’ve probably already touched it.

Make Real: sketching software before vibe coding had a name

He revisits Make Real, an early project from 2023 that let people draw an interface on a canvas and ask a model to make it interactive. Steve calls it quaint by 2026 standards, but at the time it was a big deal because nontechnical users could make software without staring at code — even if, in classic live-demo fashion, the model mostly ignored his color-change request.

Why text-structured drawing can outperform vision for canvas work

From there he explains the real trick: the model wasn't generating an image; it was producing structured objects like shapes and labels, the same primitives a human editor uses. That matters because spatial concepts are weirdly inconsistent in training data — his example is the Cartesian Y-axis going up while browser coordinates increase downward — so getting predictable visual behavior out of models took a lot of prompt engineering.
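
To make the idea concrete, here is a minimal sketch of "drawing in structured objects": the model returns shape records as JSON and the app maps them onto the canvas. The CanvasOp schema and the mapping function are hypothetical simplifications for illustration, not tldraw's actual shape format or API.

```ts
// Illustrative only: a simplified schema for "emit objects, not pixels".
// Real tldraw shape records carry more fields; these names are hypothetical.
type CanvasOp =
  | { kind: 'ellipse'; x: number; y: number; w: number; h: number }
  | { kind: 'rectangle'; x: number; y: number; w: number; h: number }
  | { kind: 'text'; x: number; y: number; text: string };

// The model is prompted to return CanvasOp[] as JSON. Note that y grows
// downward in browser coordinates, the opposite of a Cartesian Y-axis —
// exactly the kind of convention the prompt has to pin down explicitly.
function applyOps(ops: CanvasOp[], create: (shape: object) => void) {
  for (const op of ops) {
    switch (op.kind) {
      case 'ellipse':
      case 'rectangle':
        create({ type: 'geo', x: op.x, y: op.y, props: { geo: op.kind, w: op.w, h: op.h } });
        break;
      case 'text':
        create({ type: 'text', x: op.x, y: op.y, props: { text: op.text } });
        break;
    }
  }
}
```

The payoff of this shape-level output is that the result stays editable: a human (or another agent) can later move, restyle, or delete the same objects, which a generated bitmap would not allow.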

From one-shot drawing to Cursor-style agent loops

The next prototype looked more like a coding agent from 2025: ask for something like the life cycle of a butterfly, let the system iterate, review, and keep going until it thinks it’s done. Steve says this matched the conventions of the era, complete with visible thinking and rejection, but it still felt less like collaboration and more like giving your keyboard to an AI roommate.
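
A rough sketch of that loop, reusing the hypothetical CanvasOp type from the earlier example: each turn the model sees the current canvas, proposes edits, and decides whether it is finished. The callModel, readCanvas, and apply names are stand-ins for illustration, not the real tldraw agent API.

```ts
// Minimal "keep going until it thinks it's done" loop, as described above.
type Review = { done: boolean; ops: CanvasOp[] };

async function agentLoop(
  request: string, // e.g. "draw the life cycle of a butterfly"
  callModel: (prompt: string, canvasState: string) => Promise<Review>,
  readCanvas: () => string,
  apply: (ops: CanvasOp[]) => void,
  maxTurns = 10,
) {
  for (let turn = 0; turn < maxTurns; turn++) {
    // The model reviews the current canvas, proposes more edits,
    // and judges whether the drawing is complete.
    const review = await callModel(request, readCanvas());
    apply(review.ops);
    if (review.done) break;
  }
}
```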

Fairies: putting agents directly onto the canvas

So they pulled the agent out of the sidebar and turned it into little on-canvas fairies you can drag around — though they "start freaking out" if held too long. This is the emotional center of the talk: once the agents have bodies, you can see not just what they think, but where they're acting, how multiple agents work in parallel, and how one fairy can add a hat while another works on the neck of the same cat drawing.

Leader election, delegation, and visible multi-agent coordination

When Steve groups Ferris, Helen, and Joan together and asks them to draw more animals, one fairy becomes the leader, scouts the canvas, builds a to-do list, and delegates the items to the others. He frames this as tldraw discovering the same hard orchestration questions everyone else was grappling with last year — shared state, blind spots while agents are working, and avoiding collisions — except here you can literally watch it happen.
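
As a sketch of the pattern (not tldraw's implementation), the coordination can be as simple as: pick a leader, have it plan, then fan tasks out to the rest of the group. The Fairy type, the plan function, and the first-in-group election rule are all assumptions for illustration.

```ts
// Hypothetical leader-election and delegation sketch.
type Fairy = { name: string; run: (task: string) => Promise<void> };

async function delegate(
  fairies: Fairy[],
  request: string,
  plan: (req: string) => Promise<string[]>, // leader turns the request into a to-do list
) {
  // Simplest possible "election": the first fairy in the group leads.
  const [leader, ...workers] = fairies;
  // The leader scouts the canvas and breaks the request into tasks,
  // e.g. "draw more animals" -> ["draw a fox", "draw an owl", ...].
  const todo = await plan(request);
  // Round-robin delegation; running tasks in parallel is what makes shared
  // state and overlap avoidance visible problems instead of hidden ones.
  await Promise.all(
    todo.map((task, i) => (workers[i % Math.max(workers.length, 1)] ?? leader).run(task)),
  );
}
```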

The desktop app experiment where safety gets real fast

The final jump is the wildest: over the holidays, Steve wrapped tldraw in Electron, opened a local port that executes any JavaScript posted to it, and let Claude do the posting — "a terrible idea," as he repeatedly notes. But on an offline, file-based desktop app, that unlocked bizarrely powerful behavior: generating diagrams from code, editing software to match diagrams, and even inspiring the team to ask Claude to strip podcasts out of Spotify by rewriting the app's minified bundle.
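
For a sense of how little code that "terrible idea" takes, here is a hedged sketch of such an endpoint running in an Electron main process. The port, the route-free handler, and the bare eval are illustrative, not the actual implementation; the comments double as the warning.

```ts
// Sketch: a local HTTP endpoint that executes whatever JavaScript is POSTed
// to it, with the full privileges of the desktop app. Do not ship this.
import { createServer } from 'node:http';

createServer((req, res) => {
  if (req.method !== 'POST') return void res.end();
  let body = '';
  req.on('data', (chunk) => (body += chunk));
  req.on('end', () => {
    try {
      // Arbitrary code execution — this is the "sharp tool": an agent that can
      // reach localhost can rewrite files, the app, even its own bundle.
      const result = eval(body);
      res.end(String(result));
    } catch (err) {
      res.statusCode = 500;
      res.end(String(err));
    }
  });
}).listen(8123, '127.0.0.1'); // hypothetical port
```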

Sharp tools, local-first apps, and the unfinished future

The last demo wobbles into chaos when the model creates a blinking external HTML widget instead of editing tldraw directly, which Steve greets with delighted, slightly horrified disbelief. He lands on the bigger point anyway: if you want maximum agent agency, local-first and file-based software suddenly stop looking idealistic and start looking necessary — but they are absolutely sharp tools, so "have fun" is half invitation, half warning.