Dwarkesh Clips · 12m

How 3,800 DNA Sites Proved Us Wrong – David Reich

TL;DR

  • David Reich says ancient DNA has repeatedly overturned his priors — he went into both the Neanderthal and natural-selection work expecting one story, then spent years trying to make the contradictory results “go away” before accepting the data.

  • The old view was that recent human selection was mostly quiet: Reich’s 2015 study found just 12 strong signals in about 200 ancient Europeans and Middle Easterners, and even a much larger 2024 Copenhagen study found only 21, which made the field worry it had hit an asymptote.

  • A 14x data jump changed the picture: Reich’s group, in work led by Ali Akbari, analyzed about 16,000 ancient individuals spanning 18,000 years, plus modern genomes, for a dataset of roughly 22,000 people focused on Europe and the Middle East.

  • The key methodological shift was to model relatedness first, then test for directional selection on top — across 10 million DNA positions, they asked whether adding a simple constant selection term explained allele-frequency changes better than drift, bottlenecks, and admixture alone.

  • That approach exploded the number of detectable selection signals — instead of a couple dozen hits, they found at least 479 sites they’re 99% confident are real, and about 3,800 sites with better-than-50% confidence.

  • The strongest validation came from UK Biobank trait data: once their selection statistic rose above 5, the share of selected sites that are trait-associated climbed from a 15% baseline to roughly 60–70%, a four- to fivefold enrichment that Reich says is strong independent evidence the signals are genuine.

The Breakdown

When the data humiliates your intuition

Reich opens on a surprisingly personal note: the main lesson of his career has been being wrong. He says he’s been “almost traumatized” by how often he entered a project with a strong guess, only to watch the data demolish it.

The Neanderthal result he didn’t want to believe

He recalls the pre-ancient-DNA consensus he and others had helped build: non-Africans looked like a simple subset of African variation, with no sign of Neanderthal interbreeding. So when the Neanderthal genome analysis showed non-Africans were more closely related to Neanderthals than Africans were, he assumed it had to be an error and spent years trying to make the signal disappear.

Why the field thought recent natural selection might be mostly quiet

The same prior shaped his work on selection: the expectation was that over the last several hundred thousand years, human evolution had been relatively quiescent. Early ancient-DNA scans seemed to support that — in 2015, using about 200 ancient Europeans and Middle Easterners, Reich’s team found 12 highly differentiated positions, exciting but not exactly a flood.

The disappointing decade after the first big hopes

What really surprised them was that better data didn’t unlock vastly more discoveries. Reich points to a 2024 Copenhagen study with much stronger data that still found only 21 positions, which felt less like progress than evidence they might be stuck near an asymptote.

Ali Akbari’s study: more data, new machinery

Then came the reboot. In work led by Ali Akbari, the group increased the data about 14-fold, adding roughly 10,000 new ancient individuals for a total of about 16,000 ancient people spanning the last 18,000 years, or about 22,000 people once modern samples are included, mostly from Europe and the Middle East.

The statistical trick: separate ancestry history from selection

Reich explains the new method in plain language: first predict the allele frequencies at each DNA position from overall genetic relatedness, which captures the bottlenecks, drift, and admixture that affect the whole genome. Then ask whether adding one simple assumption, that selection pushed a mutation in a single direction at a constant rate over time, predicts the data any better, even if that assumption is admittedly “dumb” and oversimplified.
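To make the two-step logic concrete, here is a toy sketch of a nested-model comparison for a single site. It is not the group's actual pipeline: the binomial sampling model, the logit-scale selection term, the grid search, and every name and number below are assumptions chosen purely for illustration. The null model takes the ancestry-predicted frequency as given; the alternative adds one constant selection coefficient `s` and asks how much the fit improves.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def binom_loglik(counts, freqs):
    """Binomial log-likelihood of (derived, total) allele counts, constant term dropped."""
    total = 0.0
    for (k, n), p in zip(counts, freqs):
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def selection_lrt(counts, expected, gens, s_grid):
    """Compare an ancestry-only fit with an ancestry-plus-constant-selection fit.

    counts   : (derived_allele_count, chromosomes_sampled) per time bin
    expected : frequency predicted from genome-wide relatedness alone (the null)
    gens     : generations elapsed since the first bin
    s_grid   : candidate selection coefficients (hypothetical search grid)
    Returns (best_s, likelihood_ratio_statistic).
    """
    ll_null = binom_loglik(counts, expected)
    best_s, ll_alt = 0.0, ll_null
    for s in s_grid:
        # Selection shifts the expected frequency linearly on the logit scale.
        shifted = [expit(logit(q) + s * t) for q, t in zip(expected, gens)]
        ll = binom_loglik(counts, shifted)
        if ll > ll_alt:
            best_s, ll_alt = s, ll
    return best_s, 2.0 * (ll_alt - ll_null)

# Made-up example: ancestry alone predicts a flat 20% frequency in every bin.
gens = [0, 100, 200, 300, 400]
expected = [0.20] * 5
s_grid = [i / 10000 for i in range(-100, 101)]  # -0.01 .. 0.01

rising = [(20, 100), (27, 100), (35, 100), (45, 100), (54, 100)]
flat = [(20, 100)] * 5

s_rising, stat_rising = selection_lrt(rising, expected, gens, s_grid)
s_flat, stat_flat = selection_lrt(flat, expected, gens, s_grid)
print(f"rising site: s = {s_rising:.4f}, LR statistic = {stat_rising:.1f}")
print(f"flat site:   s = {s_flat:.4f}, LR statistic = {stat_flat:.1f}")
```

In this toy setup the steadily rising site gets a positive `s` and a large statistic, while the site that simply tracks its ancestry-based prediction gets no improvement from the extra term. A real analysis would also have to model uncertainty in the ancestry prediction itself, which this sketch ignores.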

From a couple dozen hits to hundreds — maybe thousands

That analysis across 10 million DNA positions blew up the old picture. They found many hundreds of sites changing too consistently to be chance; after accounting for nearby correlated signals, Reich says there are at least 479 independently selected positions with 99% confidence, and around 3,800 with more than 50% confidence.
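One way to read the two tiers is as a rough piece of arithmetic, assuming (and this is an interpretation, not something stated in the episode) that the confidence level is a per-site probability that the signal is real. Under that assumption the expected number of genuine signals in each tier can be bounded:

```python
def expected_true_bounds(n_sites, min_prob_real):
    """Return (lower bound on expected real signals, upper bound on expected false ones).

    Assumes every site in the tier has at least `min_prob_real` probability
    of being a genuine selection signal (an illustrative reading only).
    """
    expected_true = n_sites * min_prob_real
    expected_false = n_sites - expected_true
    return expected_true, expected_false

strict_true, strict_false = expected_true_bounds(479, 0.99)
loose_true, loose_false = expected_true_bounds(3800, 0.50)

print(f"99% tier: at least {strict_true:.2f} expected real, at most {strict_false:.2f} false")
print(f"50% tier: at least {loose_true:.0f} expected real, at most {loose_false:.0f} false")
```

On this reading, even the looser tier would imply on the order of 1,900 or more real signals, far beyond the 12 and 21 reported by the earlier scans.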

The UK Biobank reality check

Because the result seemed almost too big, the team spent years trying to break it. Their strongest outside check came from genome-wide association studies in the 500,000-person UK Biobank: as their selection statistic increased, the selected sites became more and more enriched for trait-associated variants, rising from a 15% baseline to roughly 60–70% once the statistic passed 5 — a pattern Reich treats as the clearest sign they’re seeing something real.
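The enrichment check comes down to a simple ratio, sketched below. The function names and the tiny `toy_sites` list are made up for illustration; only the 15% baseline, the 60–70% range, and the threshold of 5 come from the episode.

```python
def trait_share(sites, threshold):
    """Share of sites whose selection statistic exceeds `threshold` that are trait-associated."""
    flags = [is_trait for stat, is_trait in sites if stat > threshold]
    return sum(flags) / len(flags) if flags else float("nan")

def fold_enrichment(share, baseline):
    """How many times larger the observed share is than the baseline share."""
    return share / baseline

# Figures quoted in the episode.
baseline = 0.15
fold_low = fold_enrichment(0.60, baseline)
fold_high = fold_enrichment(0.70, baseline)
print(f"enrichment above the threshold: {fold_low:.1f}x to {fold_high:.1f}x")

# Made-up (statistic, is_trait_associated) pairs, only to show the mechanics.
toy_sites = [(1.2, False), (2.0, False), (5.5, True), (6.1, True), (7.3, False)]
toy_share = trait_share(toy_sites, 5)
print(f"toy share above 5: {toy_share:.2f}")
```

With the quoted figures the ratio lands between about four and five, which is where the "roughly fivefold" shorthand comes from.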
