
Codex’s new /goal mode turns agents into marathon runners — Riley says OpenAI’s desktop app can now stay aligned to an outcome for hours or even more than a day, citing examples of 4-hour runs and one reported task that lasted 1 day and 14 hours.
Anthropic added multitasking to Claude Code and landed Andrej Karpathy — the new claude agents terminal flow lets Riley fire off five research jobs at once, while Karpathy joining Anthropic is framed as part of a wider talent rush where CTOs from places like Super.com, Workday, and you.com are leaving big roles to become ICs.
The real race is for the AI “super app,” not just the best model — Riley’s thesis is that enterprises want one platform for chat, coding, knowledge work, integrations, automations, and browser/computer control, and he sees Codex, Claude desktop, and increasingly Cursor all converging on that shape.
Google talked AI agents everywhere but still doesn’t have a clear home for them: after Google I/O, Riley calls the AI story a “nothing burger” except for Gemini Spark, which looks promising as a Gemini mode with cloud agents, folders, Drive, NotebookLM, and multimodal tools, but he thinks Google is spreading attention across Gemini, AI Studio, and Antigravity.
Cursor’s new Composer 2.5 model impressed Riley on speed and cost — he demos generating a Linktree-style landing page in seconds, says it feels close to frontier for front-end work, and notes it’s cheap enough that an hour of use might cost less than $1.
Plugin sharing, annotations, and appshots make Codex feel more like an operating layer — teams can now share plugins across a workspace, designers can leave visual annotations that the AI turns into code changes, and the “command-command” appshot flow lets Codex capture context from any app and even type directly into Google Docs.
Riley opens by saying the AI agent world had a huge week: Codex added /goal, Anthropic hired Andrej Karpathy, and Google is rolling out its own answer to OpenClaw-style agents. The whole point of the new series is to cut through the noise and track the updates that actually matter if you want to stay on the frontier of using AI agents.
Back in the terminal, Riley shows Anthropic’s new claude agents flow, which lets him spin up multiple tasks at once instead of working in one long linear chat. He fires off five separate research jobs and flips between them with the keyboard, calling it a genuinely fun new way to work, especially because it can use custom subagents like his web research specialist.
Then he shifts to Karpathy joining Anthropic, which he treats like a blockbuster transfer. He says people are comparing it to “Ronaldo joining Manchester City,” while also pointing to a broader pattern: top people from Super.com, Workday, you.com, and even Bun are leaving elite roles to join Anthropic as individual contributors — what he calls the era of the “polymathic individual contributor.”
Riley gives Claude’s desktop app credit for getting better fast, especially around browser reliability and ease of use, and says it’s one of the strongest super app candidates. But he’s still frustrated that skills created in Claude’s Cowork environment don’t cleanly carry over into Claude Code, calling Cowork “one of their biggest mistakes” and basically pleading for “one single app that can do anything.”
On the OpenAI side, Riley says /goal is the standout update because it changes Codex from command-following to objective-seeking. His example is absurd on purpose — “create 30 iOS apps” — but the point is that Codex now plans more deeply and stays locked on the end state rather than quitting after the first instruction.
He also highlights plugin sharing across teams, which makes Codex more collaborative: you can build a workflow-specific plugin, share it workspace-wide, and teammates get it in a “shared with you” tab. Then he walks through the new annotation-heavy design mode, explaining that it doesn’t edit code directly like Cursor does, but lets you point at UI elements, say things like “make this bigger,” and have the AI apply those changes after the fact.
The feature Riley is most animated about is appshots: press both command keys and Codex grabs the current app, takes a screenshot, and opens with full context. He shows how that works in a browser, in Superhuman email, and most dramatically in Google Docs, where Codex uses computer control to type directly into the document while he watches.
Riley says Google talked nonstop about AI agents at Google I/O, but from his perspective the AI story was basically a “nothing burger” except for Gemini Spark. His core complaint is simple: if you’re a serious user, you still don’t know which Google product is the agent platform. It could be Gemini, AI Studio, or the Antigravity developer tool.
Spark, though, gets his attention because it appears inside Gemini as a dedicated mode with folder access, cloud computers, connections, Drive, Photos, NotebookLM, guided learning, and multimodal generation. He just doesn’t trust Google to focus, and that leads into a blunt critique: he says DeepMind, despite pioneering AI, currently isn’t best at chat, coding, video, or image generation. Even the newly released Gemini 3.5 Flash strikes him as fast but not compelling enough to matter.
Riley closes with strong praise for Cursor’s new Composer 2.5 model, saying it’s extremely fast, very cheap, and especially strong for front-end work. In his demo it spins up a Linktree-style landing page in seconds, then restyles it around Cursor’s own brand just as quickly, which he says would take longer in Codex or Claude.
What matters strategically is that Cursor is no longer just a coding tool in his eyes. With an in-app browser, integrations marketplace, automations, and stated ambitions around coding plus knowledge work, he sees it heading straight toward feature parity with Codex and Claude as a full enterprise “super app.”