How geneticists date a 4,000-year-old mixing event
Two ancestral populations met somewhere in South Asia. Almost every Indian alive today is descended from both. Genetics can tell us when — but only by reading the patterns of shuffled DNA in living people.
Swaveda · April 26, 2026
There are two distinct ancestries woven into the genome of almost every Indian alive today. They came from two populations that, until a few thousand years ago, were entirely separate. Geneticists call them ANI — Ancestral North Indians, related to populations in Central Asia, the Middle East, and Europe — and ASI, Ancestral South Indians, with no close relatives outside the subcontinent [S2]. The proportions vary widely across modern groups, but almost every Indian group has some of both.
This is not just a curiosity for population geneticists. The framing of who came from where, and when, runs through some of the most contested questions in Indian history — debates about caste, the Aryan migration, the relationship between archaeology and texts. The genetics doesn't settle those debates. But it constrains them. A 2013 paper by Priya Moorjani and her colleagues set out to answer one of the cleanest questions you can ask about ANI and ASI: when, exactly, did these two ancestries meet?
The mix isn't even across the country. ANI ancestry is highest across the north, from Punjab through parts of the Hindi belt to parts of Bengal, and falls off going south, reaching its lowest levels in Tamil Nadu and Kerala [S2]. The gradient is real and roughly tracks language families. Indo-European speakers (across Punjab, the Hindi belt, Maharashtra, Bengal) tend to have more ANI. Dravidian speakers (Tamil, Telugu, Kannada, Malayalam) tend to have more ASI. The Northeast is a separate case: Tibeto-Burman speakers there carry East Asian ancestry layered on top, making the picture genuinely different.
The variation is substantial. In some North Indian populations, ANI ancestry can run above 70%. In some South Indian and tribal populations, it can be below 30%. But almost every Indian alive today, regardless of region or language, sits somewhere on this ANI/ASI spectrum.
So when did the mixing happen?
The signal in shuffled DNA
When two distinct populations first interbreed, their offspring inherit long unbroken stretches of DNA from each parent. In the first generation, a chromosome looks roughly like a single long ANI chunk and a single long ASI chunk laid end to end.
Each generation that follows, the two parental chromosomes get shuffled — recombination, the same process that makes you genetically different from your siblings — and the chunks get shorter. Ten generations later, the chunks are noticeably shorter. After a hundred generations, they're very short indeed.
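To put rough numbers on that shrinkage, here is a minimal sketch. It assumes about one crossover per Morgan (100 centimorgans) of chromosome per generation, so that after n generations the breakpoints sit about 100 / n centimorgans apart on average. The function name is hypothetical, and the figure is a floor rather than a prediction: neighbouring chunks of the same ancestry merge, so real ancestry tracts run somewhat longer.

```python
def mean_chunk_length_cm(generations: int) -> float:
    """Rough mean spacing between recombination breakpoints, in centimorgans.

    Assumption (not from the article): crossovers accumulate at about one
    per Morgan (100 cM) per generation, so the mean spacing after n
    generations is about 100 / n cM.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / generations


# Chunks shrink in inverse proportion to the number of generations elapsed.
for n in (10, 50, 100):
    print(f"{n:>3} generations: about {mean_chunk_length_cm(n):.0f} cM per chunk")
```

On this approximation, chunks are about ten times shorter after a hundred generations than after ten.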
If you can measure the average length of the alternating chunks in a person's genome, you can work backward to estimate how many generations have passed since the mixing began. In practice, geneticists read this off linkage disequilibrium decay: how quickly the correlation between pairs of genetic markers fades as the distance between them grows. Longer chunks mean fewer generations. Shorter chunks, more generations.
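The working-backward step can be sketched as a curve fit. Under the usual simplification of a single pulse of mixing, the admixture-induced correlation between two markers falls off roughly as A * exp(-n * d), where d is the genetic distance between them in Morgans and n is the number of generations since mixing. The toy example below builds a synthetic decay curve and recovers n from it. The function name, the amplitude of 0.1, and the noise level are illustrative assumptions. This is not the published rolloff software, which also handles the corrections a real dataset needs.

```python
import numpy as np


def estimate_generations(distance_morgans, ld):
    """Estimate n by fitting ld ~ A * exp(-n * d).

    Taking logs turns the model into a straight line,
    log(ld) = log(A) - n * d, so the slope of a linear fit is -n.
    """
    d = np.asarray(distance_morgans, dtype=float)
    log_ld = np.log(np.asarray(ld, dtype=float))
    slope, _intercept = np.polyfit(d, log_ld, 1)
    return -slope


# Synthetic data only: a decay curve for a mixing event 100 generations ago,
# with 2% multiplicative noise. No real genotypes are involved.
true_generations = 100
distances = np.linspace(0.005, 0.05, 40)  # 0.5 cM to 5 cM, in Morgans
rng = np.random.default_rng(0)
noise = rng.normal(1.0, 0.02, size=distances.size)
ld_curve = 0.1 * np.exp(-true_generations * distances) * noise

estimated = estimate_generations(distances, ld_curve)
print(f"estimated generations since mixing: {estimated:.1f}")
```

A log-linear fit is the simplest choice for a sketch like this. With noisy real data, a nonlinear least-squares fit on the untransformed curve is the more usual approach.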
To convert generations into calendar years, geneticists use a generation length of about 29 years: the average age of a parent at the birth of their child, calibrated from independent demographic studies. A hundred generations is roughly 2,900 years. To make the scale concrete: think about the time between you and your great-grandparents, three or four generations, around 100 years. Now stretch that back nearly thirty times.
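The conversion is a single multiplication, shown here as a small sketch. The 29-year figure comes from the text above. The function names are hypothetical.

```python
YEARS_PER_GENERATION = 29  # generation length quoted in the text


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a count of generations into calendar years."""
    return generations * years_per_generation


def years_to_generations(years, years_per_generation=YEARS_PER_GENERATION):
    """Inverse conversion: how many generations fit into a span of years."""
    return years / years_per_generation


print(generations_to_years(100))  # 2900
print(generations_to_years(4))    # 116
print(years_to_generations(2900)) # 100.0
```

Because the generation length is itself an estimate, every date derived this way scales with it: a shorter assumed generation pulls all the dates closer to the present.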
The math is intricate. There are corrections for population structure, for partial relatedness between source populations, for the way recombination rates vary across the genome. What Moorjani's team built on top of these foundations was a refined method called rolloff [S1], which tightens the confidence intervals and reduces some of the artifacts that earlier LD-dating attempts had struggled with.
What they found
Moorjani's team applied the rolloff method to genome-wide data from 73 distinct groups across the Indian subcontinent — sampling caste, tribal, and regional populations across north and south, urban and rural, Indo-European and Dravidian speakers [S1]. The picture that emerged was strikingly consistent across that diversity.
The earliest mixing they could detect happened about 4,200 years ago. The most recent, about 1,900 years ago. In a subset of groups — including some prominent ones like the Vysya, an Andhra Pradesh trading caste — the entire mixture event fit cleanly inside this window. For those populations, the genetic record begins with two distinct ancestries and ends with one; essentially no ongoing mixing followed.
The dates clustered by language family. If your mother tongue is Hindi, Bengali, Marathi, or another Indo-European language, the mixing in your ancestors averaged about 72 generations ago, roughly 2,100 years. If you grew up speaking Tamil, Telugu, Kannada, Malayalam, or another Dravidian language, the average was older: about 108 generations, roughly 3,100 years. Indo-European populations, on average, finished their mixing about a millennium later than Dravidian populations. Within both groups, tribal and lower-caste communities tended to show older mixing dates than upper-caste communities, suggesting additional waves of admixture in higher-caste populations later in the timeline.
The headline finding of the paper is broader than the dates. India shifted, sometime in this window, from a society where mixing across populations was common to one where mixing even between closely related groups became rare. Geneticists call this transition the rise of endogamy — the practice of marrying within one's group. The endogamous structure that defines so much of South Asian society today, including caste boundaries, began leaving a clear genetic signal during this period [S1].
The dating is solid
Multiple research groups have now run LD-decay analyses on different population samples and gotten broadly consistent answers. The mixing window of 1,900 to 4,200 years ago is, at this point, one of the firmer findings in South Asian population genetics. It's the kind of result that survives independent re-analysis with fresh data. The science is settled on the dating itself.
The bigger questions — where ANI ancestry came from, why these two populations met when they did, what political and cultural changes accompanied the shift to endogamy — those are different questions, and they're being actively investigated. They need different tools: archaeology, linguistic reconstruction, ancient DNA from earlier periods, careful study of textual evidence. Subsequent work, especially Vagheesh Narasimhan and colleagues' 2019 paper genotyping 523 ancient individuals across Central and South Asia [S3], has filled in much of the timeline before the mixing window. The Moorjani dating estimate has held up.
What you carry in your genome, then, is a timestamp. It says when, somewhere between roughly 2200 BCE and 100 CE, your ancestors stopped being two distinct populations and started being one. That's the part the science can answer cleanly. The rest, the why and the how, remains open.
Sources cited
- [S1] Genetic Evidence for Recent Population Mixture in India. Priya Moorjani, Kumarasamy Thangaraj, Nick Patterson, et al., 2013, American Journal of Human Genetics 93(3):422-438. (Paper · Tier 1)
- [S2] Reconstructing Indian Population History. David Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price, Lalji Singh, 2009, Nature 461(7263):489-494. (Paper · Tier 1)
- [S3] The formation of human populations in South and Central Asia. Vagheesh M. Narasimhan, Nick Patterson, Priya Moorjani, et al., 2019, Science 365(6457):eaat7487. (Paper · Tier 1)