Rate Limited · 1h 0m

GPT 5.5 is a coding BEAST, developing agents, and RIP Jobs | Ep 15

TL;DR

  • GPT-5.5 flipped their default coding stack — Eric, Adam, and Ray all say OpenAI’s new model has pulled them away from Claude for day-to-day work because it feels faster, more reliable, and better at writing cleaner code with less “ultra defensive slop.”

  • Codex is becoming an ‘everything app,’ not just a coding tool — Ray describes using it for insurance admin, health tasks, tax prep, Notion workflows, and Mac computer control, while Adam had it analyze 1,500 collectible cards, manage a Shopify site, and debug a major open-source issue at work.

  • The real unlock isn’t just the model — it’s agent workflows and context management — Eric argues the best results come from splitting work across subagents, giving models bird’s-eye-view context, and encoding successful patterns into reusable ‘skills,’ especially for large repos and cleanup tasks.

  • Anthropic’s recent struggles are showing up in user behavior — Ray says Claude kept refusing or half-completing tasks even on a $200/month plan, while Adam says he regularly hit Claude Code limits but has a hard time hitting Codex limits, helped by OpenAI’s temporary 2x allowance on the $100/$200 plans through the end of May.

  • The Coinbase-style ‘AI is replacing jobs’ narrative is too simplistic — Adam pushes back that code was never the main bottleneck in most companies; testing, design, production hardening, and cross-team coordination still dominate, so many layoffs look more like overhiring correction and cost pressure than pure AI displacement.

  • Building agents is emerging as the high-value skill — all three hosts land on the idea that agent development is more like game design than classic software engineering: you need good ‘vibes,’ fast feedback, memory, adaptability, and enough ‘juice’ in the UX that the system feels capable even when it stumbles.

The Breakdown

GPT-5.5 lands, and suddenly everyone stops talking like Claude loyalists

The episode opens with the crew reacting to GPT-5.5 as a genuine shift, not just another model release. Eric says many people are switching away from Claude after its 4.7 release disappointed, and Adam bluntly says 5.5 is now “all I’ve been using,” mostly at medium reasoning, because it just feels good and keeps delivering.

From collectible cards to Shopify to open-source debugging

Adam’s examples are wildly practical: he fed 1,500-plus collectible cards into Codex for analysis, used it to manage Shopify pages, and had it quickly diagnose a major issue in a widely used open-source repo — correctly. Ray says the thing that finally broke his Anthropic habit was reliability: Claude kept refusing or only partially completing tasks, while GPT-5.5 would just say yes and do the work.

The goblin bug and why some people never saw it

One of the funniest detours is GPT-5.5’s weird fantasy-language leak — talking about goblins and creatures because a “nerdy” personality tuning bled into the broader model. Eric explains that OpenAI quietly suppresses this in products like ChatGPT and Codex with system-prompt instructions, but if you replace the prompt in third-party apps, you’ll see lines about “little goblins” wandering around unmanaged.

Why 5.5 feels smarter: cleaner code, less anxiety, better orchestration

Eric’s main technical point is that GPT-5.5 feels like a larger, more rounded model than Opus, with more “intelligence density per reasoning token.” He says one visible difference is code quality: instead of writing defensive nests of edge-case sludge, it produces cleaner, more aesthetic solutions — especially when you structure work through subagents, staged context gathering, and reusable workflows rather than one giant prompt.

Goal mode, long-running agents, and the Mac automation wave

The conversation shifts from coding to how Codex is expanding into a general-purpose work hub. Ray is excited about computer use on Mac, saying it can open Mail, scroll through family insurance tasks, work with health paperwork, and tie into Notion, while Eric points to Goal mode as a way to keep Codex working for hours or even days without constant babysitting.

Anthropic’s SpaceX compute deal and the politics of AI infrastructure

They then unpack Anthropic’s deal to use the 300-megawatt Colossus 1 data center tied to SpaceX, which reportedly helped justify doubling hourly rate limits. The hosts are fascinated by the irony of Elon Musk partnering with a company he previously mocked, and they dwell on the strange contractual clause that would let him pull compute if he decides Anthropic’s models have stopped being “good for humanity.”

RIP jobs? Not exactly — the bottleneck was never just writing code

The mood turns more serious around Coinbase, PayPal, and broader layoffs framed around AI. Adam, drawing from his own experience and network, argues the story is being oversold: companies still get blocked by testing, rollout, design, architecture, and coordination, and the cost of heavy token usage can erase a lot of the raw productivity gains.

Agents as game design: vibes, juice, and the next valuable skill

The closing stretch is the most memorable: Adam says building agents is like making a video game — what matters is whether it feels good, recovers well, adapts to users, and keeps interactions alive instead of dead-ending. Eric builds on that with the concept of “juice,” borrowed from game design, and all three end up encouraging laid-off workers and engineers to learn agent building now, because that blend of prompting, UX feel, tool design, and context management is where demand is going.
