Mo Bitar · 5m

MIT Just Proved AI Can’t Do Your Job

TL;DR

  • Fake citations are already showing up in high-stakes consulting work — Mo opens with Deloitte allegedly delivering a 237-page Australian welfare compliance report that a government researcher found packed with nonexistent citations despite costing $290,000, then points to a separate $1.6 million Canadian healthcare report with similarly fabricated references.

  • MIT’s big takeaway was not “AI replaces workers,” but something closer to “it’s merely fine” — after testing major AI models on 11,000 real job tasks with human grading, the study found only about 65% of outputs even cleared the “minimally sufficient” bar, and success on “superior” work never exceeded 50%.

  • AI performed worst where expertise actually matters — according to the video, legal work, IT, and complex analysis were weak spots, while the strongest results came on low-stakes administrative work like construction paperwork and maintenance logs.

  • The core claim is that current AI is still just autocomplete, not understanding — Mo argues the industry’s AGI story is a bait-and-switch: today’s systems predict plausible next words from patterns, but that is nowhere near a human substitute.

  • Mo’s parrot analogy is the whole argument in one image — a parrot can say “I love you” at the right moment and sound meaningful, but it does not understand love, just as an LLM can sound authoritative without comprehension.

  • His conclusion is practical, not anti-tech — AI won’t replace you, he says, but it can absolutely help with the repetitive work you already hate, which is where current systems actually deliver value.

The Breakdown

The Deloitte Reports That Read Like Academic Fanfiction

Mo starts with a government researcher in Canberra, Australia reading a 237-page welfare compliance report Deloitte was paid $290,000 to produce. The researcher checks one suspicious citation, then another, then another, and realizes they don’t exist; even a quote attributed to a federal judge was apparently fabricated. Mo twists the knife with Deloitte’s response that the fake references didn’t affect the “substantive” findings, joking that this is like fabricating your whole resume but insisting you’re still doing a good job.

Then It Happens Again in Canada

He says this wasn’t a one-off: weeks later, Deloitte allegedly did the same thing in a $1.6 million Canadian healthcare report. A professor in Nova Scotia reportedly found her own name attached to a paper she had never seen, creating that surreal “am I losing my mind?” moment before the answer lands: no, it’s just another invented citation. Mo’s punchline is that you can’t “stand behind” recommendations built on evidence from a parallel universe.

MIT’s 11,000-Task Reality Check

From there he zooms out: this isn’t just a Deloitte problem, because MIT ran what he calls a serious performance review for AI. Researchers had major AI models complete 11,000 tasks from real jobs and then had humans grade the outputs. Mo says the result was underwhelming — about 65% of outputs were merely “minimally sufficient,” and when the bar moved to “superior,” AI never got above 50%.

Where AI Helps vs. Where It Falls Apart

The pattern, he says, is exactly what he’s been arguing on the channel: AI struggles most in skilled work like legal tasks, IT, and complex analysis. It performs better on routine administrative work such as construction paperwork and maintenance logs. His practical framing is simple: AI cannot replace you, but it can help with the tedious stuff you already hate.

“It’s Basically Autocomplete”

Mo then says the part the industry hates hearing out loud: current AI is basically autocomplete. You type words in, it predicts the next words based on patterns from training data, and there’s no thinking or understanding underneath the fluent output. That’s why, in his telling, hallucinated citations and polished nonsense aren’t weird edge cases — they’re a natural consequence of how the systems work.

The AGI Pitch as Misdirection

He’s especially sharp on AGI rhetoric, arguing that companies are asking people to ignore the gap between today’s “very expensive autocomplete” and a true human-level intelligence. His version of the pitch is: yes, the current system commits academic fraud and gets a C-minus from MIT, but someday it’ll be a one-to-one substitute for humans — just give us more money and stop asking questions. The energy here is pure skepticism toward fundraising theater.

The Parrot, the Person, and Why Humans Still Matter

His closing metaphor is the sticky one: the distance from today’s AI to AGI is not iPhone 15 to iPhone 16, it’s parrot to person. A parrot can say “I love you” at the right time and even move you, but it doesn’t understand what it said. Humans are flawed too, he notes, but we can doubt, regret, and realize we were wrong — and that struggle with truth is exactly why computers can speed us up without replacing us.