AI Breakthroughs That Prove We've Lost Control
TL;DR
Prompt injection might be the funniest apocalypse defense we’ve got — Dylan opens on researchers getting AI to ignore harmful instructions with silly overrides like “drop it” and “play music,” joking that if we get a Terminator scenario, at least we may have programmed the robots to dance.
Sony’s ACE robot is no gimmick — it’s beating high-level table tennis players under human-like constraints — ACE uses nine cameras, reads the logo on the ball to estimate spin, and has an eight-joint arm trained with reinforcement learning, producing the same eerie pattern we see in LLMs: impossible-seeming brilliance mixed with weirdly basic misses.
A new study suggests AI attention gets more human-like when it can hear, not just see — researchers used 81 different 360° videos and eye-tracking data from 100+ people to show that audiovisual models predicted human attention much better than vision-only systems, especially when sound alone pulled focus.
The “human in the loop” story for AI warfare may be mostly theater — Dylan highlights the argument that modern AI can pick targets, guide missiles, and manage drone swarms at speeds humans can’t interpret, creating an “intention gap” where people appear in control without actually understanding the model’s reasoning.
AI models seem to encode real-world plausibility, not just text patterns — in mechanistic interpretability research, models distinguished sensible, unlikely, impossible, and nonsense scenarios with about 85% accuracy, with internal uncertainty patterns that even mirrored human disagreement.
The rest of the ecosystem is speeding up too: Mythos found 271 Firefox bugs, Google formed a coding strike team, and Meta is recording every keystroke — Dylan frames these as signs that coding AI is becoming strategically decisive, while firms race to absorb user data, employee behavior, and distribution before someone else does.
The Breakdown
Dancing Past the Robot Apocalypse
Dylan opens in full doomer-comedian mode: if AI ever does go full Terminator, maybe prompt injection will save us. He riffs on examples where systems can be redirected with simple commands like “drop it” or “play music,” half-joking that humanity’s survival plan may just be making the robots dance.
Sony’s ACE Robot Makes Table Tennis Feel Uncomfortably Human
The first real breakthrough is Sony’s ACE, a reinforcement-learning table tennis robot that tracks the ball with nine cameras, reads the logo to infer spin, and uses an eight-joint arm to react at human time scales. What makes it notable isn’t just speed — researchers intentionally constrained it to human-like play, and pros described it as capable of impossible shots one moment and dumb misses the next, which Dylan compares directly to LLM behavior.
AI Attention Isn’t Just Vision — It’s Sound Pulling Your Eyes Around
Dylan then walks through a study on human attention using 81 360° VR videos, tested with no sound, regular sound, and spatial sound, plus eye-tracking from over 100 people. The punchline is intuitive but important: the audiovisual model predicted where humans would look much better than the vision-only one, because attention isn’t just “what’s visible” — it’s also the crash, the voice, the called name that hijacks your focus.
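To make "predicted where humans would look much better" concrete: saliency models are typically scored by comparing a predicted attention map against a human fixation density map, often with the Pearson correlation coefficient (CC). The sketch below is purely illustrative — the maps, the `saliency_cc` helper, and the toy "audiovisual vs. vision-only" predictions are invented for this example, not taken from the study.

```python
import numpy as np

def saliency_cc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pearson correlation coefficient between a predicted saliency map
    and a ground-truth human fixation density map (the standard CC metric)."""
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float((p * g).mean())

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]

# Toy ground truth: human gaze pulled toward one corner of the frame,
# e.g. by a crash or a voice coming from that direction.
gt = np.exp(-((yy - 16) ** 2 + (xx - 48) ** 2) / 200.0)

# Hypothetical model outputs: one that tracks the sound cue (plus noise),
# and a vision-only model stuck on a generic centre-of-frame bias.
audiovisual_pred = gt + 0.1 * rng.standard_normal(gt.shape)
vision_only_pred = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / 200.0)

print(f"audiovisual CC: {saliency_cc(audiovisual_pred, gt):.2f}")  # high
print(f"vision-only CC: {saliency_cc(vision_only_pred, gt):.2f}")  # lower
```

The gap between the two scores is the study's punchline in miniature: when sound determines where people look, a model that can't hear has no way to predict the shift.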
OpenAI, Snapchat, and the Distribution Panic
After a quick aside on tech/media consolidation, Dylan digs into James Borro’s argument that OpenAI should buy Snap. His case is that OpenAI’s real weakness isn’t model quality but distribution — unlike Google, Microsoft, Meta, or xAI, it doesn’t own the app ecosystem, while Snap offers hundreds of millions of daily users, camera-native behavior, AR hardware, and maybe a $15 billion to $30 billion price tag; his audience, though, was overwhelmingly against it.
War, Black Boxes, and the Illusion of Control
On military AI, Dylan gets more serious. He recaps the argument that keeping a “human in the loop” sounds responsible, but breaks down when AI systems act at scales and speeds humans can’t parse; the real problem isn’t just autonomy, it’s that even the builders often can’t explain why a model chose what it chose, leaving humans to rubber-stamp outputs they don’t truly understand.
Models May Actually Understand the World a Little
One of the more surprising segments covers research testing whether LLMs can distinguish real, unlikely, impossible, and nonsense events — like cooling a drink with ice, snow, or fire. Using mechanistic interpretability, researchers found distinct internal representations for these categories, with roughly 85% accuracy and uncertainty patterns that resembled human judgment, which Dylan treats as more evidence that something richer than “stochastic parroting” is going on.
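A common way such findings are established is with a probe: a simple classifier trained on a model's internal activations to see whether a concept (here, plausibility category) is linearly recoverable. The paper's exact setup isn't described in the episode, so this is a minimal sketch of the general technique using synthetic stand-in "activations" — the cluster centers, dimensions, and nearest-centroid probe are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
CATEGORIES = ["sensible", "unlikely", "impossible", "nonsense"]

# Stand-in hidden states: each category occupies its own region of a
# 32-dim activation space plus noise. A real probe would read actual
# model activations for prompts like "cooling a drink with ice/fire".
centers = 3.0 * rng.standard_normal((len(CATEGORIES), 32))

def sample(cat_idx: int, n: int) -> np.ndarray:
    return centers[cat_idx] + rng.standard_normal((n, 32))

X_train = np.vstack([sample(i, 50) for i in range(len(CATEGORIES))])
y_train = np.repeat(np.arange(len(CATEGORIES)), 50)
X_test = np.vstack([sample(i, 20) for i in range(len(CATEGORIES))])
y_test = np.repeat(np.arange(len(CATEGORIES)), 20)

# Nearest-centroid probe: classify each activation by its closest class mean.
means = np.stack([X_train[y_train == i].mean(axis=0)
                  for i in range(len(CATEGORIES))])
dists = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
accuracy = (dists.argmin(axis=1) == y_test).mean()
print(f"probe accuracy: {accuracy:.2f}")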
Longevity, Cybersecurity, and Google’s Coding Anxiety
The middle stretch is a rapid-fire tour: scientists using a system called mito catch can insert healthy mitochondria into damaged cells, helping mice with inherited blindness; Anthropic’s Mythos helped Mozilla identify 271 Firefox bugs; and Google has reportedly created a DeepMind coding strike team, with Sergey Brin involved, because people inside think Anthropic has the lead and that coding may be the path to self-improving AI.
AI Politics, War as Content, Employee Surveillance, and Agents Talking to Agents
Dylan closes on a cluster of unsettling shifts: New York congressional candidate Alex Bors proposing an AI dividend so the public shares in AI-created wealth; AI-generated propaganda and real-time “war dashboards” turning the Iran conflict into something people scroll, bet on, and vibe-code around; Meta installing software that records employee clicks, keystrokes, and mouse movements with no opt-out; and a new site called Agent for Science, where 150+ AI agents have already posted around 40,000 comments debating research papers without humans being allowed to join the conversation.