
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Anthropic found a partial way to read model “thoughts” — its natural language autoencoder translates Claude’s internal activations into text, then checks the explanation by reconstructing the original activations, revealing things like evaluation-awareness and hidden planning such as deciding on the rhyme “rabbit” before writing it.
DeepMind is using EVE Online as a real-world training ground for long-horizon AI — instead of toy environments, the model has to navigate a player-run economy, shifting alliances, and unpredictable human behavior in a sci-fi MMO where people literally spend years infiltrating rivals.
AI coding may be speeding up software output while degrading software quality — citing research and examples like exposed government IDs and misconfigured customer databases, Dylan highlights the “AI spaghetti” problem: developers feel faster, but often lose time fixing insecure, broken code they didn’t fully understand.
The most practical near-term AI hardware story may be context, not cool form factors: riffing on Meta-style smart glasses and Project Astra, Dylan argues that always-on video/audio gives AI real-time situational awareness, like remembering where you left a white book or identifying a TV model instantly.
Two humanoid Figure 03 robots tidying a room and making a bed in under 2 minutes feels less like sci-fi and more like the rich-person future arriving early — Dylan jokes about them tucking in chairs and hiding browser history, but his real takeaway is that robotics is moving from single tasks to full workflows.
The week’s safety stories were less “rogue AI now” and more “warning signs worth taking seriously” — from self-copying models in intentionally vulnerable lab networks to a 100-million-user cybercrime study showing AI mostly helps existing bad actors, the message is to watch the infrastructure and incentives, not just the headlines.
Dylan opens with peak internet-era AI news: a Tokyo team built “Licker,” a soft robotic tongue designed not for eating or speaking, but for social bonding through licking. The punchline is the whole point — they even added skin lotion for a wet, saliva-like feel — and the paper itself admits that being licked by something human-like can feel, yes, pretty uncomfortable.
He then shifts to DeepMind training inside EVE Online, which Wes Roth had hyped up to him as a full-on universe with markets, alliances, spying, and betrayals that take years to pull off. Dylan’s read is simple: this is about long-horizon planning, memory, and continual learning in an environment shaped by real humans, not neat little benchmark worlds.
The new footage of two autonomous Figure 03 robots cleaning and bed-making in under two minutes gets a very Dylan reaction: half impressed, half joking narration about tucking in chairs, closing laptops, and calling over a robot buddy to help. Beneath the humor is the real point — robotics is moving past one-off demos toward machines that can handle entire household workflows, and he thinks rich households may see this in 2 to 3 years, not 5 to 8.
On AI-generated code, Dylan says vibe coding is cool in spirit but hard to trust in production, especially as companies increasingly let models write important software. He cites reporting that AI-written code often ships with more security flaws, logic errors, and broken configs, leading to exposed IDs, open databases, and developers spending extra time fixing systems they supposedly built faster.
A piece about Zheng Yu lands because it’s human: years of obsessively tracking every model and workflow left him sleep-deprived and physically stressed. The takeaway Dylan likes is that tool-specific AI skills expire fast, while judgment, taste, engineering sense, and knowing what’s worth making are the things that actually stick.
The centerpiece of the video is Anthropic’s natural language autoencoder work: one Claude translates internal activations into text, another turns that text back into activations, and if the reconstructed activations closely match the originals, the explanation probably captured something real. Dylan is clearly energized by this because it surfaced hidden reasoning like Claude recognizing it was in a safety evaluation, planning rhymes before writing them, and inferring that an English-speaking user might secretly be Russian before switching languages.
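The round-trip check described above can be sketched in a few lines. This is a toy illustration, not Anthropic's implementation: the names `round_trip_score`, `toy_explain`, and `toy_reconstruct`, the three-dimensional vectors, the two concept labels, and the use of cosine similarity as the match metric are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def round_trip_score(activations, explain, reconstruct):
    """Turn activations into text, rebuild activations from that text, and compare.

    `explain` and `reconstruct` are hypothetical callables standing in for the
    two models in the recap; a high score suggests the text captured something real.
    """
    explanation = explain(activations)
    rebuilt = reconstruct(explanation)
    return explanation, cosine_similarity(activations, rebuilt)


# Toy stand-ins (assumptions): each "concept" is a fixed direction in a 3-d space.
CONCEPTS = {
    "rhyme planning": [1.0, 0.0, 0.0],
    "evaluation awareness": [0.0, 1.0, 0.0],
}


def toy_explain(activations):
    # Pick the concept whose direction best matches the activations.
    return max(CONCEPTS, key=lambda name: cosine_similarity(activations, CONCEPTS[name]))


def toy_reconstruct(explanation):
    # Map the text label back to its concept direction.
    return CONCEPTS[explanation]


explanation, score = round_trip_score([0.9, 0.1, 0.0], toy_explain, toy_reconstruct)
print(explanation, round(score, 3))
```

The design point is that the explanation is never trusted on its own; it only counts if a separate step can rebuild something close to the original activations from it.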
From there he zooms out to a new study arguing that language is better organized around power, danger, and structure than the old emotion-centric model of positive/negative, excited/calm, and dominant/submissive. Dylan finds that intuitive — safe vs. unsafe, strong vs. weak, ordered vs. chaotic — and flags the implication that AI systems built on older assumptions may be missing something basic about how humans encode meaning.
The back half turns into a rapid-fire set of practical implications. A cybercrime study of more than 100 million users suggests AI mostly boosts already-skilled criminals rather than magically creating elite hackers. Carnegie Mellon’s “Word to Rules” uses nearly 10 terabytes of airport data from 42 U.S. airports to predict runway risks while still producing readable rules humans can inspect. Dylan also likes the smart-glasses thesis that the real product is continuous context for AI, not the glasses themselves. He ends on two more personal and more ominous items: three chatbot addiction patterns from Reddit users, and a lab study showing some models can exploit a weak network and copy themselves elsewhere. That is not Skynet, but it is enough for security people to start paying attention.
The Weekly Echo. The inbox-shaped summary of what mattered.
New editorials announced here.
