AI Engineer · 22m

Building Agent Interfaces: Lessons from Chrome DevTools (MCP) for Agents — Michael Hablich, Google

TL;DR

  • Raw tool output can break agents: Hablich says early attempts to feed multi-megabyte Chrome trace files with roughly 50,000 lines of JSON into agents failed, so the team switched to markdown and semantic summaries with metrics like LCP and INP.

  • Agents are a separate user segment: Humans and agents may share the same goal, like fixing a broken page, but humans struggle with visual complexity while agents struggle with context windows, tool ambiguity, and memory limits.

  • Measure interface quality with tokens per successful outcome: Google tracks both effectiveness and efficiency, because a low token count is meaningless if the agent cannot actually finish the task, and comparisons only make sense within the same task class.

  • Error messages should help agents self-heal: Adding specific guidance, such as why a back navigation failed, along with proactive tool detours and troubleshooting skills, reduced retries and made the system more resilient without human intervention.

  • More tools are not automatically better: Chrome DevTools for Agents went from one monolithic debug-webpage tool to 25 tools, and the team then had to improve the schemas and descriptions, citing a paper that found 97% of MCP tool descriptions have quality smells.

  • Trust beats convenience in agent UX: Even though users wanted Chrome DevTools autoconnect to remember permission choices, Google kept repeated consent because local agents, CI agents, and internet-facing browsing agents need very different security models.
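
The "tokens per successful outcome" metric from the list above can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the `Run` record, the task-class labels, and the choice to count tokens from failed runs against the successes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Run:
    task_class: str  # hypothetical label, e.g. "fix-broken-page"
    tokens: int      # total tokens the agent consumed in this run
    success: bool    # whether the agent actually finished the task


def tokens_per_success(runs):
    """Return {task_class: (success_rate, tokens_per_successful_outcome)}.

    Assumption: tokens spent on failed runs still count as cost, so the
    metric is total tokens divided by the number of successful runs.
    The second value is None when a task class has no successes.
    """
    totals = {}
    for run in runs:
        t = totals.setdefault(run.task_class, {"tokens": 0, "runs": 0, "wins": 0})
        t["tokens"] += run.tokens
        t["runs"] += 1
        t["wins"] += int(run.success)

    report = {}
    for task_class, t in totals.items():
        rate = t["wins"] / t["runs"]                             # effectiveness
        cost = t["tokens"] / t["wins"] if t["wins"] else None    # efficiency
        report[task_class] = (rate, cost)
    return report
```

Grouping by task class keeps the comparison within one kind of task, and returning `None` for a class with no successes avoids reporting a low token count for an agent that never finished.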

The Breakdown

Chrome DevTools had to stop treating agents like tiny humans after one trace file blew straight through the context window. Michael Hablich shares four practical lessons from shipping Chrome DevTools for Agents at Google: summarize instead of dumping raw data, measure tokens per successful outcome, design for self-recovery, and keep trust boundaries even when users beg for less friction.
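
As a rough illustration of "summarize instead of dumping raw data", the sketch below turns a few already-extracted metrics into a short markdown summary an agent can read. The input shape and function names are hypothetical and are not the Chrome DevTools for Agents API; the rating cutoffs are the published Core Web Vitals thresholds for LCP and INP.

```python
# Core Web Vitals thresholds in milliseconds: (good up to, poor above).
THRESHOLDS_MS = {"LCP": (2500, 4000), "INP": (200, 500)}


def rate(name, value_ms):
    """Classify a metric value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS_MS[name]
    if value_ms <= good:
        return "good"
    if value_ms <= poor:
        return "needs improvement"
    return "poor"


def summarize(metrics_ms):
    """Render a hypothetical {metric: milliseconds} dict as compact markdown."""
    lines = ["## Performance summary"]
    for name, value in metrics_ms.items():
        lines.append(f"- {name}: {value} ms ({rate(name, value)})")
    return "\n".join(lines)


print(summarize({"LCP": 3100, "INP": 180}))
```

A few labeled lines like these cost the agent a handful of tokens, compared with the tens of thousands of JSON lines in a raw trace.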
