Science. 2019 Sep 6;365(6457):eaat7487.
doi: 10.1126/science.aat7487.

The formation of human populations in South and Central Asia

Vagheesh M Narasimhan, Nick Patterson, Priya Moorjani, Nadin Rohland, Rebecca Bernardos, Swapan Mallick, Iosif Lazaridis, Nathan Nakatsuka, Iñigo Olalde, Mark Lipson, Alexander M Kim, Luca M Olivieri, Alfredo Coppa, Massimo Vidale, James Mallory, Vyacheslav Moiseyev, Egor Kitov, Janet Monge, Nicole Adamski, Neel Alex, Nasreen Broomandkhoshbacht, Francesca Candilio, Kimberly Callan, Olivia Cheronet, Brendan J Culleton, Matthew Ferry, Daniel Fernandes, Suzanne Freilich, Beatriz Gamarra, Daniel Gaudio, Mateja Hajdinjak, Éadaoin Harney, Thomas K Harper, Denise Keating, Ann Marie Lawson, Matthew Mah, Kirsten Mandl, Megan Michel, Mario Novak, Jonas Oppenheimer, Niraj Rai, Kendra Sirak, Viviane Slon, Kristin Stewardson, Fatma Zalzala, Zhao Zhang, Gaziz Akhatov, Anatoly N Bagashev, Alessandra Bagnera, Bauryzhan Baitanayev, Julio Bendezu-Sarmiento, Arman A Bissembaev, Gian Luca Bonora, Temirlan T Chargynov, Tatiana Chikisheva, Petr K Dashkovskiy, Anatoly Derevianko, Miroslav Dobeš, Katerina Douka, Nadezhda Dubova, Meiram N Duisengali, Dmitry Enshin, Andrey Epimakhov, Alexey V Fribus, Dorian Fuller, Alexander Goryachev, Andrey Gromov, Sergey P Grushin, Bryan Hanks, Margaret Judd, Erlan Kazizov, Aleksander Khokhlov, Aleksander P Krygin, Elena Kupriyanova, Pavel Kuznetsov, Donata Luiselli, Farhod Maksudov, Aslan M Mamedov, Talgat B Mamirov, Christopher Meiklejohn, Deborah C Merrett, Roberto Micheli, Oleg Mochalov, Samariddin Mustafokulov, Ayushi Nayak, Davide Pettener, Richard Potts, Dmitry Razhev, Marina Rykun, Stefania Sarno, Tatyana M Savenkova, Kulyan Sikhymbaeva, Sergey M Slepchenko, Oroz A Soltobaev, Nadezhda Stepanova, Svetlana Svyatko, Kubatbek Tabaldiev, Maria Teschler-Nicola, Alexey A Tishkin, Vitaly V Tkachev, Sergey Vasilyev, Petr Velemínský, Dmitriy Voyakin, Antonina Yermolayeva, Muhammad Zahir, Valery S Zubkov, Alisa Zubova, Vasant S Shinde, Carles Lalueza-Fox, Matthias Meyer, David Anthony, Nicole Boivin, Kumarasamy Thangaraj, Douglas J Kennett, Michael Frachetti, Ron Pinhasi, David Reich

Vagheesh M Narasimhan et al. Science. 2019.

Abstract

By sequencing 523 ancient humans, we show that the primary source of ancestry in modern South Asians is a prehistoric genetic gradient between people related to early hunter-gatherers of Iran and Southeast Asia. After the Indus Valley Civilization's decline, its people mixed with individuals in the southeast to form one of the two main ancestral populations of South Asia, whose direct descendants live in southern India. Simultaneously, they mixed with descendants of Steppe pastoralists who, starting around 4000 years ago, spread via Central Asia to form the other main ancestral population. The Steppe ancestry in South Asia has the same profile as that in Bronze Age Eastern Europe, tracking a movement of people that affected both regions and that likely spread the distinctive features shared between Indo-Iranian and Balto-Slavic languages.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Overview of ancient DNA data.
(A) Distribution of sites and associated archeological or radiocarbon dates, along with the number of individuals from each site meeting our analysis thresholds. (B) Locations of ancient individuals for whom we generated ancient DNA that passed our analysis thresholds, along with the locations of individuals from 140 groups from present-day South Asia that we analyzed as forming the Modern Indian Cline. Shapes distinguish the individuals from different sites. Data from 106 South Asian groups that do not fit along the Modern Indian Cline, as well as AHG, are not shown. (C) PCA of ancient and modern individuals projected onto a basis formed by 1,340 present-day Eurasians reflects clustering of individuals that mirrors their geographical relationships. An interactive version of this figure is presented in the Online Data Visualizer.
Fig. 2. Outlier analysis reveals ancient contacts between sites.
We plot the average of Principal Component 1 (x-axis) and Principal Component 2 (y-axis) for the West Eurasian and All Eurasian PCA plots, as we found that this aids visual separation of the ancestry profiles. (A) In the Middle to Late Bronze Age Steppe, we observe, in addition to the Western_Steppe_MLBA and Central_Steppe_MLBA clusters (indistinguishable in this projection), outliers admixed with other ancestries. The BMAC-related admixture in Kazakhstan documents northward gene flow onto the Steppe and confirms the Inner Asian Mountain Corridor as a conduit for movement of people. (B) At Shahr-i-Sokhta in eastern Iran, there are two primary groupings: one with ~20% Anatolian farmer-related ancestry and no detectable AHG-related ancestry, and the other with ~0% Anatolian farmer-related ancestry and substantial AHG-related ancestry (Indus Periphery Cline). (C) In individuals of the BMAC and successor sites, we observe a main cluster as well as numerous outliers: outliers before 2000 BCE with admixture related to WSHG, outliers before 2000 BCE on the Indus Periphery Cline (with an ancestry similar to the outliers at Shahr-i-Sokhta), and outliers after 2000 BCE that reveal how Central_Steppe_MLBA ancestry had arrived. (D) In the Late Bronze Age and Iron Age of northernmost South Asia, we observe a main cluster consistent with admixture between peoples of the Indus Periphery Cline and Central_Steppe_MLBA, and variable Steppe pastoralist-related admixture.
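The averaging described at the start of this caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code. The function name, sample labels, and coordinates are hypothetical, and the sketch assumes the two projections are already on comparable scales, which the caption does not state.

```python
def average_pca_coordinates(west_eurasian, all_eurasian):
    """Average (PC1, PC2) per individual across two PCA projections.

    Each argument maps a sample label to its (PC1, PC2) coordinates.
    Only individuals present in both projections are returned.
    """
    shared = sorted(set(west_eurasian) & set(all_eurasian))
    averaged = {}
    for sample in shared:
        w1, w2 = west_eurasian[sample]
        a1, a2 = all_eurasian[sample]
        averaged[sample] = ((w1 + a1) / 2, (w2 + a2) / 2)
    return averaged


# Hypothetical coordinates, for illustration only.
west = {"sample_A": (0.02, -0.04), "sample_B": (0.10, 0.06)}
eurasia = {"sample_A": (0.04, 0.00), "sample_B": (0.20, 0.02)}
averaged = average_pca_coordinates(west, eurasia)
```

Each individual then gets a single (x, y) position that blends its placement in both plots.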
Fig. 3. Ancestry Transformations in Holocene Eurasia.
(A) Ancestry clines before and after the advent of farming. We document a South Eurasian Early Holocene Cline of increasing Iranian farmer- and West Siberian hunter-gatherer-related ancestry moving west-to-east from Anatolia to Iran, and a North Eurasian Early Holocene Cline of increasing relatedness to East Asians moving west-to-east from Europe to Siberia. Mixtures of peoples along these two clines following the spread of farming formed five later gradients (shaded), listed here from west to east: the European Cline, the Caucasus Cline from which the Yamnaya formed, the Central Asian Cline which characterized much of Central Asia in the Copper and Bronze Ages, the Southwest Asian Cline established by spreads of farmers in multiple directions from several loci of domestication, and the Indus Periphery Cline. (B) Following the appearance of the Yamnaya Steppe pastoralists, Western_Steppe_EMBA (Yamnaya-like) ancestry spread across this vast region. We use arrows to show plausible directions of spread of increasingly diluted ancestry (the arrows are not meant as exact routes, which we do not have enough sampling to determine at present). Rough estimates of the timing of the arrival of this ancestry and estimated ancestry proportions are shown.
Fig. 4. The Genomic Formation of South Asia.
(A) The degree of allele sharing with southern Asian hunter-gatherers (AASI) measured by f4(Ethiopia_4500BP, X; Ganj_Dareh_N, AHG) and with Steppe pastoralists measured by f4(Ethiopia_4500BP, X; Central_Steppe_MLBA, Ganj_Dareh_N) reveals three ancestry clines that succeeded each other in time: the Indus Periphery Cline prior to ~2000 BCE, the Steppe Cline represented by northern South Asian individuals after ~2000 BCE, and the Modern Indian Cline. (B) Modeling South Asians as a mixture of Central_Steppe_MLBA, AHG (as a proxy for AASI), and Indus_Periphery_West (the individual from the Indus Periphery Cline with the least AASI ancestry). Groups along the edges of the triangle fit a two-way model, and groups in the interior fit only a three-way model. The 140 present-day South Asian groups on the Modern Indian Cline are shown as small dots. (C) Groups that traditionally view themselves as being of priestly status are shown in red in this and the preceding panel (“Brahmin,” “Pandit,” and “Bhumihar” but excluding “Catholic Brahmins”), and tend to have a significantly higher ratio of Central_Steppe_MLBA to Indus_Periphery_Cline ancestry than other groups. (D) Plot of the proportion of Central_Steppe_MLBA ancestry on the autosomes (x-axis) and the Y chromosome (y-axis) shows that the source of this ancestry is primarily from females in Late Bronze Age and Iron Age individuals from the Swat District of northernmost South Asia, and primarily from males in most present-day South Asians.
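The f4 statistics in panel (A) follow the standard definition f4(A, B; C, D) = mean over SNPs of (p_A - p_B)(p_C - p_D), where p is the allele frequency in each population. A minimal sketch of that definition is given below. The allele frequencies and the placeholder labels ("Outgroup", "X", "SourceC", "SourceD") are made up for illustration and are not the study's data. Real analyses also attach standard errors (typically via a block jackknife), which this sketch omits.

```python
def f4(freqs, a, b, c, d):
    """Estimate f4(a, b; c, d) as the mean over SNPs of (p_a - p_b) * (p_c - p_d).

    `freqs` maps a population label to a list of allele frequencies,
    one per SNP, with all lists in the same SNP order.
    """
    n_snps = len(freqs[a])
    total = 0.0
    for i in range(n_snps):
        total += (freqs[a][i] - freqs[b][i]) * (freqs[c][i] - freqs[d][i])
    return total / n_snps


# Toy allele frequencies at four SNPs (hypothetical values).
toy_freqs = {
    "Outgroup": [0.1, 0.5, 0.9, 0.3],
    "X":        [0.3, 0.4, 0.7, 0.5],
    "SourceC":  [0.2, 0.6, 0.8, 0.2],
    "SourceD":  [0.4, 0.3, 0.6, 0.6],
}

stat = f4(toy_freqs, "Outgroup", "X", "SourceC", "SourceD")
flipped = f4(toy_freqs, "Outgroup", "X", "SourceD", "SourceC")
```

With an outgroup in the first position, a positive value indicates that X shares more alleles with the fourth population than with the third. Swapping the last two populations flips the sign.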
Fig. 5. Admixture Graph Model.
The largest deviation between empirical and theoretical f-statistics is |Z|=2.9, indicating a good fit considering the large number of f-statistics analyzed. Admixture events are shown as dotted lines labeled by proportions, with the minor ancestry in gray. The present-day groups are shown in orange ovals, the ancient ones in blue, and unsampled groups in white. (The ovals and admixture events are positioned according to guesses about their relative dates to help in visualization, although the dates are in no way meant to be exact.) In this graph we do not attempt to model the contribution of WSHG and Anatolian farmer-related ancestry, and thus cannot model Central_Steppe_MLBA, the proximal source of Steppe ancestry in South Asia (instead we model the Steppe ancestry in South Asia through the more distally related Yamnaya). However, the admixture graph does highlight several key findings of the study, including the deep separation of the AASI from other Eurasian lineages, and the fact that some Austroasiatic-speaking groups in South Asia (e.g., Juang) harbor ancestry from a South Asian group with a higher ratio of AASI-related to Iranian farmer-related ancestry than any groups on the Modern Indian Cline, thus revealing that groups with substantial Iranian farmer-related ancestry were not ubiquitous in peninsular South Asia in the 3rd millennium BCE when Austroasiatic languages likely spread across the subcontinent.
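The fit criterion quoted at the start of this caption compares each observed f-statistic with the value the graph predicts, scaled by its standard error, and reports the largest absolute Z-score. A minimal sketch of that calculation is given below. The function name and all numbers are hypothetical. It illustrates the idea only and is not the fitting software used in the study.

```python
def worst_z(observed, fitted, std_errors):
    """Return the Z-score with the largest magnitude, plus all Z-scores.

    Z = (observed - fitted) / standard error, computed per f-statistic.
    """
    z_scores = [(o - f) / se for o, f, se in zip(observed, fitted, std_errors)]
    worst = max(z_scores, key=abs)
    return worst, z_scores


# Hypothetical f-statistics, model predictions, and standard errors.
observed = [0.0120, -0.0030, 0.0005]
fitted = [0.0110, -0.0010, 0.0005]
std_errors = [0.0005, 0.0008, 0.0004]

worst, z_scores = worst_z(observed, fitted, std_errors)
```

Because many f-statistics are tested at once, a few |Z| values approaching 3 are expected by chance, which is why the caption reads |Z|=2.9 as a good fit.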

References

    1. Online Data Visualizer (available at https://public.tableau.com/views/TheGenomicFormationofSouthandCentralAsi...).
    2. Fuller DQ, Lucas L, in Human Dispersal and Species Movement, Boivin N, Petraglia M, Crassard R, Eds. (Cambridge University Press, Cambridge, 2017), pp. 304–331.
    3. Stevens CJ et al., Between China and South Asia: A Middle Asian corridor of crop dispersal and agricultural innovation in the Bronze Age. The Holocene. 26, 1541–1555 (2016).
    4. Allaby RG, Stevens C, Lucas L, Maeda O, Fuller DQ, Geographic mosaics and changing rates of cereal domestication. Philos. Trans. R. Soc. B Biol. Sci. 372, 20160429 (2017).
    5. Dani AH et al., History of Civilizations of Central Asia: The Development of Sedentary and Nomadic Civilizations, 700 B. C. to A (UNESCO Publishing, 1994).
