The Artificial Intelligence Show Podcast · 18m

The Honest Truth About AI Agents That No One in Your Feed Is Saying

TL;DR

  • The hosts are pro-agent, not blindly all-in — They argue AI agents are clearly part of the future, but most of the internet skips the hard parts: production reliability, governance, security, token budgets, and whether a given agent is actually worth maintaining.

  • Enterprise adoption is way messier than the demos suggest — After Google Next, Paul describes talking with leaders managing genAI rollouts, token budgets, and vendor selection at major companies, and says most firms still struggle with basics like Copilot rollout, GPT training, and workflow analysis.

  • Pricing is the biggest unresolved problem — The episode hammers on how impossible it is to budget when tools burn through credits unpredictably, citing HubSpot credits disappearing three days into a billing cycle and asking how any business can plan around token- or credit-based agent usage.

  • There isn’t just one kind of "agent" — Mike draws a sharp distinction between guided, in-the-loop tools like Claude Code or Codex and persistent autonomous systems like OpenClaw, arguing that lumping them together hides important tradeoffs around supervision, value, and operational burden.

  • Governance becomes a real problem the second agents are useful — Once a company has 20 agents hooked into connectors, knowledge bases, and subscriptions, someone has to manage what they can access, keep skills updated, and monitor risk, a challenge Paul says most organizations are wildly unprepared for.

  • If agents truly replace labor, pricing may move toward replacement value — Paul says a CEO would gladly pay $3,000 per month for agents doing $300,000 a year of work, which suggests the long-term pricing model may look less like cheap SaaS seats and more like charging against human replacement cost.

The Breakdown

The week that forced the agent conversation

The episode opens with a flood of examples pushing agents into the spotlight: Jason Lemkin of SaaStr publicly sharing how SaaStr uses Artisan for outbound, Qualified for inbound, and Agentforce for reactivation, while Microsoft, OpenAI, and Google all roll out or hype agent capabilities. That sets up the real question the hosts want to answer: not "are agents the future?" but why the practical realities of using them inside actual businesses keep getting glossed over.

Paul’s "this might be it" moment in ChatGPT

Paul says he walked into the office after Google Next, opened ChatGPT, started browsing templates and connections, and immediately thought, "Oh my god, this might be it." The excitement is real, but so is the context: he’d just spent days talking to people inside big companies who are actually responsible for genAI adoption, token budgets, and deciding whether to bet on Anthropic, OpenAI, or Google.

The enterprise reality check: budgets, vendors, and chaos

This is where the podcast gets very grounded. Paul describes leaders who are either in "token maxing" mode or panicking because developers blow through a monthly budget in two days, and he gives a painfully relatable example of opening HubSpot only to find the team had already burned through its credits three days into the billing cycle. The problem isn’t just cost; it’s not even being able to audit where the cost came from in a way that makes sense.
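The budgeting problem Paul describes is, at bottom, a burn-rate calculation: given how fast credits are disappearing, when does the pool run dry? A minimal sketch of that projection, with entirely hypothetical numbers (the episode names HubSpot but gives no actual credit figures):

```python
from datetime import date, timedelta

def projected_exhaustion(credits_total: float, credits_used: float,
                         cycle_start: date, today: date) -> date:
    """Project the day a credit pool runs out at the current burn rate."""
    days_elapsed = max((today - cycle_start).days, 1)
    daily_burn = credits_used / days_elapsed
    days_remaining = (credits_total - credits_used) / daily_burn
    return today + timedelta(days=int(days_remaining))

# Hypothetical pool: half of 100,000 credits burned in ten days
# projects exhaustion ten days from now.
print(projected_exhaustion(100_000, 50_000, date(2025, 6, 1), date(2025, 6, 11)))

# The episode's scenario in miniature: everything gone by day three,
# so the projected exhaustion date is already today.
print(projected_exhaustion(100_000, 100_000, date(2025, 6, 1), date(2025, 6, 4)))
```

Even this toy version shows why the hosts call the problem unsolved: a linear projection only tells you *when* the money runs out, not *which* agent or workflow burned it, which is the audit question Paul says no one can answer.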

Twenty agents later, who’s actually in charge?

The SaaStr stories are inspiring, but they also reveal the hidden mess: once you have 20 agents with connectors, subscriptions, knowledge bases, and evolving skills, governance becomes its own job. Paul points to a Wiz demo at Google Next on agent risk management as "beautiful" but also a warning sign that most companies are nowhere near ready for the controls required when agents gain access to more and more data.
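The governance job Paul describes starts with something mundane: an inventory of which agents exist, what they can touch, and who is accountable for them. A minimal sketch of such a registry, with all names and fields invented for illustration (the episode describes the problem, not any particular schema):

```python
from dataclasses import dataclass

@dataclass
class AgentRecord:
    """One row in a hypothetical agent inventory (all values illustrative)."""
    name: str
    connectors: list[str]       # systems the agent can reach
    data_scopes: list[str]      # what it may read or write
    owner: str                  # human accountable for it
    last_skill_review: str      # when its skills were last audited

inventory = [
    AgentRecord("outbound-sdr", ["crm", "email"], ["contacts:read"],
                "sales-ops", "2025-05-01"),
    AgentRecord("reactivation", ["crm"], ["accounts:read", "accounts:write"],
                "cs-lead", "2025-03-15"),
]

# A basic governance query: which agents hold write access to anything?
risky = [a.name for a in inventory
         if any(scope.endswith(":write") for scope in a.data_scopes)]
print(risky)  # ['reactivation']
```

Multiply this by 20 agents whose connectors and skills keep changing, and the point of the Wiz demo becomes clear: someone has to own this table, and most companies have no one assigned to it.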

Building agents is easier than taking them to production

One of the stickiest observations in the episode is that AI now empowers nontraditional builders to make useful things in Replit or similar tools, but not necessarily to run them reliably. Paul says you can build something, launch it, have it break, and then end up asking Claude what failed because you don’t actually know how to support production systems — a very human moment that captures the current gap between prototype and dependable deployment.

Mike splits the agent category in two

Mike argues people talk about "agents" as if it’s one thing, when in practice a guided coding system like Claude Code or Codex is very different from a persistent autonomous setup like OpenClaw. That distinction matters because not using a 24/7 always-on agent doesn’t mean you’re behind; it may just mean your use case is better served by supervised, in-the-loop systems that are easier to control.

The pricing model nobody can explain yet

The strongest thesis in the back half is that current pricing makes no business sense. Mike says he has no idea whether a persistent agent should cost 5 cents or $5,000 a month, and Paul pushes it further: if an agent truly does the work of a customer success hire, SDR, or three employees combined, vendors will eventually charge more like replacement value than $20-per-seat SaaS. His line is blunt and memorable: if a tool created $300,000 a year in value, paying $3,000 a month would feel obvious.
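Paul's numbers imply a simple pricing formula: charge a fraction of the labor value the agent replaces rather than a flat seat fee. A sketch of that arithmetic, where the 12% capture rate is just what his $300,000-a-year, $3,000-a-month example works out to, not anything stated in the episode:

```python
def replacement_value_price(annual_labor_value: float,
                            capture_rate: float = 0.12) -> float:
    """Monthly price as a share of the annual labor value an agent replaces.

    capture_rate is hypothetical: Paul's $300k/year -> $3k/month example
    implies the vendor capturing 12% of the value created.
    """
    return annual_labor_value * capture_rate / 12

# $300,000/year of replaced work at a 12% capture rate -> $3,000/month
print(replacement_value_price(300_000))  # 3000.0
```

The gap between that figure and a $20-per-seat subscription is the hosts' point: if agents genuinely replace headcount, today's SaaS pricing is off by two orders of magnitude, and nobody knows where between 5 cents and $5,000 the market will settle.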

Why "just go use agents" is bad advice for most companies

The episode closes on a sober point: most enterprises haven’t even fully exploited GPTs, standard chat, or connected data sources, let alone autonomous agents. For AI-native startups, going all-in may work; for companies on the AI margin, stuck with legacy systems, governance, talent constraints, and regulation, the smarter question may be whether to wait until the tooling gets simpler instead of overinvesting in architectures that could be obsolete in a year.