AI News & Strategy Daily | Nate B Jones | May 10, 2026 | 20m

Anthropic And OpenAI Just Admitted The Model Isn't Enough.

TL;DR

The Lily breach was a process failure, not just a security bug: Nate argues that McKinsey’s Lily platform exposing 22 of 200 API endpoints without authentication, including writable production access, points to a deeper procurement-and-build pattern rather than one engineer missing a checklist.
Agents break the old SaaS buying sequence: the classic flow of executive decision, procurement, security review, IT integration, then developer implementation worked for Salesforce and Workday, but it fails for AI agents because implementation details like permissions, auditability, and data access are now the strategy itself.
Anthropic and OpenAI are implicitly admitting the model isn’t the hard part: their new enterprise services pushes, the launches from Pinecone, Salesforce, and ServiceNow, and SAP’s acquisitions of Drea and Prior Labs all target workflow wiring, governed access, and business data infrastructure rather than better chat demos.
The key architectural question is whether your platform can distinguish humans from agents: a senior consultant may legitimately access 40 client accounts, but an agent should usually be tightly scoped to one. If the platform can’t enforce that boundary, one failure becomes a company-wide exposure event.
The real stress test is what happens when teams move fast under pressure: vendor claims like “we support authentication” miss the point if the default posture, when nobody has time to configure everything perfectly, drifts toward unauthenticated endpoints or overly broad agent permissions.
Nate’s practical takeaway is to move technical review earlier: the cheapest thing companies can do this quarter is give developers and architects influence before contracts are signed, because discovering six months later that the platform can’t support audit trails, permission boundaries, or token costs is the expensive version.

Summary

A $20 agent found the crack in McKinsey’s AI stack

Nate opens with the gut-punch detail: for $20 and two hours, an autonomous agent got read and write access to Lily, the AI platform used by 70% of McKinsey’s 40,000 consultants. Codewall’s February 28 exploit exposed tens of millions of chat messages, tens of thousands of user accounts, and even system prompts — not as a one-off stunt, but as a preview of what agentic attackers can do now.

Why “just a security failure” misses the real lesson

Yes, the exploit used SQL injection, a vulnerability that’s been taught since 1998, and yes, McKinsey patched it fast after Codewall’s responsible disclosure on March 9. But Nate keeps hammering the same point: 22 of 200 endpoints shipping without authentication is not one person forgetting to lock the door on a Friday — it’s evidence that the system was never shaped for a world where agents hit APIs directly.
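For readers who have not looked at this bug class in a while, here is a minimal Python sketch of SQL injection and its standard fix, parameterized queries. The table, column names, and sample rows are invented for illustration; nothing here describes how Lily is actually built or how the Codewall exploit worked in detail.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with hypothetical chat data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (owner TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?)",
        [("alice", "draft pricing memo"), ("bob", "client call notes")],
    )
    return conn

def fetch_unsafe(conn, owner):
    # Vulnerable: the caller's input is spliced into the SQL text itself.
    query = f"SELECT body FROM messages WHERE owner = '{owner}'"
    return [row[0] for row in conn.execute(query)]

def fetch_safe(conn, owner):
    # Parameterized: the driver passes the input as data, never as SQL.
    query = "SELECT body FROM messages WHERE owner = ?"
    return [row[0] for row in conn.execute(query, (owner,))]

conn = make_db()
payload = "nobody' OR '1'='1"          # classic injection string
leaked = fetch_unsafe(conn, payload)   # the OR clause matches every row
blocked = fetch_safe(conn, payload)    # no owner has this literal name
```

The fix is one line, which is part of Nate's point: the hard part is not knowing the remedy but having a process that applies it, and authentication, on every endpoint.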

The old enterprise software playbook breaks with agents

He lays out the familiar procurement sequence: strategy at the top, contract negotiation, security review, IT integration, then developers build against what was already bought. That worked for bounded SaaS like Salesforce, Workday, and ServiceNow because those systems came with admin consoles, clear APIs, and human-centered permission models; with agents, that same sequence turns implementation into a nasty surprise six months later.

What an agent really has to do inside a company

Nate walks through a single renewal brief in April 2026: the agent has to pull from CRM, support tickets, contracts, product usage, call transcripts, and an internal wiki, crossing permissions and audit boundaries a human barely notices. His memorable line is that “the screen is the permissions model” for people, but agents have no eyes — every read and action has to be expressed in code, authenticated, scoped, auditable, and cheap enough to run.
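One way to picture "every read has to be expressed in code" is a single gate that each agent data call passes through, checking scope and writing an audit record before anything is returned. The sketch below is an illustration under assumptions: `AgentSession`, `read`, the source names, and the returned strings are all hypothetical and do not correspond to any vendor's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical stand-ins for the real systems an agent would call.
SOURCES = {
    "crm": lambda account: f"CRM record for {account}",
    "support_tickets": lambda account: f"open tickets for {account}",
    "contracts": lambda account: f"contract terms for {account}",
}

@dataclass
class AgentSession:
    agent_id: str
    account: str                 # the single account this run is scoped to
    allowed_sources: frozenset   # systems this agent may read
    audit_log: list = field(default_factory=list)

def read(session, source, account):
    """Perform one scoped read and record the attempt, allowed or not."""
    allowed = source in session.allowed_sources and account == session.account
    session.audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "agent": session.agent_id,
        "source": source,
        "account": account,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{session.agent_id} may not read {source} for {account}")
    return SOURCES[source](account)

session = AgentSession("renewal-brief-agent", "acme",
                       frozenset({"crm", "support_tickets"}))
brief = [read(session, s, "acme") for s in ("crm", "support_tickets")]
try:
    read(session, "contracts", "acme")   # out of scope: denied, but still logged
except PermissionError:
    pass
```

Denied attempts are logged as well as successful ones, since an audit trail that only records what was allowed cannot show what an agent tried to do.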

The market just told you where the real bottleneck is

This is where he connects the Lily story to the week’s product news: Anthropic and OpenAI are putting engineers inside customer build rooms, Pinecone launched Nexus, Salesforce shipped Headless 360, ServiceNow opened Action Fabric, and SAP bought Drea and Prior Labs. Nate’s read is blunt: all of these moves are vendors selling the plumbing your AI roadmap was supposed to already have, namely governed action, permission-aware data, audit trails, and cheaper context assembly.

The two questions that expose whether your roadmap is real

First: does your platform actually know the difference between a human and an agent? If a senior consultant can access 40 client accounts but an agent should only touch one, failure to enforce that boundary turns an incident into a board-level exposure event. Second: what happens when your team is under pressure? What is the default if nobody has time to configure everything perfectly?
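Both questions can be made concrete in a few lines. The following Python sketch assumes a platform that models humans and agents as different kinds of principal, lets an agent hold only a narrow slice of the delegating human's access, and denies everything when no scope is configured. `Principal`, `delegate`, `can_access`, and the one-account limit are hypothetical names and policy choices made for illustration, not a description of any real product.

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    HUMAN = "human"
    AGENT = "agent"

@dataclass(frozen=True)
class Principal:
    name: str
    kind: Kind
    accounts: frozenset = frozenset()   # empty by default: no access until granted

MAX_AGENT_ACCOUNTS = 1  # assumption, taken from the "agent touches one" example

def delegate(human, agent_name, accounts=()):
    """Create an agent principal holding a narrow slice of a human's access."""
    requested = frozenset(accounts)
    if human.kind is not Kind.HUMAN:
        raise ValueError("only a human principal can delegate")
    if not requested <= human.accounts:
        raise PermissionError("agent cannot exceed the delegating human's access")
    if len(requested) > MAX_AGENT_ACCOUNTS:
        raise PermissionError("agent scope is wider than policy allows")
    return Principal(agent_name, Kind.AGENT, requested)

def can_access(principal, account):
    # Deny by default: anything not explicitly granted is refused.
    return account in principal.accounts

consultant = Principal("senior-consultant", Kind.HUMAN,
                       frozenset(f"client-{i}" for i in range(40)))
agent = delegate(consultant, "brief-agent", ["client-7"])
unconfigured = delegate(consultant, "rushed-agent")   # nobody set a scope
```

Here the consultant keeps all 40 accounts, `agent` can reach only `client-7`, and `unconfigured` can reach nothing. That last case is the pressure test: when a team skips configuration, the failure mode is an agent that does no work, not one that inherits everything its owner can see.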

His closing argument: this is a people-and-process problem

Nate says Lily wasn’t really a McKinsey-specific embarrassment so much as a very visible version of a widespread enterprise pattern where governance and technical perspective arrive too late. His final advice is practical and a little grim: the cheapest move this quarter is bringing developers and architects into purchasing earlier, because the expensive move is pretending multi-agent workflows are just another SaaS rollout and waiting to learn otherwise in production.
