Nature. 2016 Oct 13;538(7624):207-214.

doi: 10.1038/nature18299. Epub 2016 Sep 21.

A genomic history of Aboriginal Australia

Anna-Sapfo Malaspinas¹,²,³, Michael C Westaway⁴, Craig Muller¹, Vitor C Sousa²,³, Oscar Lao⁵,⁶, Isabel Alves²,³,⁷, Anders Bergström⁸, Georgios Athanasiadis⁹, Jade Y Cheng⁹,¹⁰, Jacob E Crawford¹⁰,¹¹, Tim H Heupink⁴, Enrico Macholdt¹², Stephan Peischl³,¹³, Simon Rasmussen¹⁴, Stephan Schiffels¹⁵, Sankar Subramanian⁴, Joanne L Wright⁴, Anders Albrechtsen¹⁶, Chiara Barbieri¹²,¹⁷, Isabelle Dupanloup²,³, Anders Eriksson¹⁸,¹⁹, Ashot Margaryan¹, Ida Moltke¹⁶, Irina Pugach¹², Thorfinn S Korneliussen¹, Ivan P Levkivskyi²⁰, J Víctor Moreno-Mayar¹, Shengyu Ni¹², Fernando Racimo¹⁰, Martin Sikora¹, Yali Xue⁸, Farhang A Aghakhanian²¹, Nicolas Brucato²², Søren Brunak²³, Paula F Campos¹,²⁴, Warren Clark²⁵, Sturla Ellingvåg²⁶, Gudjugudju Fourmile²⁷, Pascale Gerbault²⁸,²⁹, Darren Injie³⁰, George Koki³¹, Matthew Leavesley³², Betty Logan³³, Aubrey Lynch³⁴, Elizabeth A Matisoo-Smith³⁵, Peter J McAllister³⁶, Alexander J Mentzer³⁷, Mait Metspalu³⁸, Andrea B Migliano²⁹, Les Murgha³⁹, Maude E Phipps²¹, William Pomat³¹, Doc Reynolds⁴⁰, Francois-Xavier Ricaut²², Peter Siba³¹, Mark G Thomas²⁸, Thomas Wales⁴¹, Colleen Ma'run Wall⁴², Stephen J Oppenheimer⁴³, Chris Tyler-Smith⁸, Richard Durbin⁸, Joe Dortch⁴⁴, Andrea Manica¹⁸, Mikkel H Schierup⁹, Robert A Foley¹,⁴⁵, Marta Mirazón Lahr¹,⁴⁵, Claire Bowern⁴⁶, Jeffrey D Wall⁴⁷, Thomas Mailund⁹, Mark Stoneking¹², Rasmus Nielsen¹,⁴⁸, Manjinder S Sandhu⁸, Laurent Excoffier²,³, David M Lambert⁴, Eske Willerslev¹,⁸,¹⁸

Affiliations

¹ Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
² Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland.
³ Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
⁴ Research Centre for Human Evolution, Environmental Futures Research Institute, Griffith University, Nathan, Queensland 4111, Australia.
⁵ CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain.
⁶ Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain.
⁷ Population and Conservation Genetics Group, Instituto Gulbenkian de Ciência, 2780-156 Oeiras, Portugal.
⁸ Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
⁹ Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark.
¹⁰ Department of Integrative Biology, University of California, Berkeley, California 94720, USA.
¹¹ Verily Life Sciences, 2425 Garcia Ave, Mountain View, California 94043, USA.
¹² Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany.
¹³ Interfaculty Bioinformatics Unit, University of Bern, Baltzerstrasse 6, CH-3012 Bern, Switzerland.
¹⁴ Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet, Building 208, 2800 Kongens Lyngby, Denmark.
¹⁵ Department for Archaeogenetics, Max Planck Institute for the Science of Human History, Kahlaische Straße 10, D-07745 Jena, Germany.
¹⁶ The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark.
¹⁷ Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Kahlaische Straße 10, D-07745 Jena, Germany.
¹⁸ Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK.
¹⁹ Integrative Systems Biology Laboratory, Division of Biological and Environmental Sciences & Engineering, King Abdullah University of Science and Technology, 23955-6900 Thuwal, Saudi Arabia.
²⁰ Institute for Theoretical Physics, ETH Zürich, Wolfgang-Pauli-Str. 27, 8093 Zürich, Switzerland.
²¹ Jeffrey Cheah School of Medicine & Health Sciences, Monash University Malaysia, Jalan Lagoon Selatan, Sunway City, 46150 Selangor, Malaysia.
²² Evolutionary Medicine Group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, UMR 5288, Centre National de la Recherche Scientifique, Université de Toulouse 3, 31073 Toulouse, France.
²³ Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark.
²⁴ CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Rua dos Bragas 289, 4050-123 Porto, Portugal.
²⁵ National Parks and Wildlife, Sturt Highway, Buronga, New South Wales 2739, Australia.
²⁶ Explico Foundation, Vågavegen 16, 6900 Florø, Norway.
²⁷ Giriwandi, Gimuy Yidinji Country, Queensland 4868, Australia.
²⁸ Research Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK.
²⁹ UCL Department of Anthropology, 14 Taviton Street, London WC1H 0BW, UK.
³⁰ Yinhawangka elder, Perth, Western Australia 6062, Australia.
³¹ Papua New Guinea Institute of Medical Research, PO Box 60, Goroka, Papua New Guinea.
³² Archaeology, School of Humanities & Social Sciences, University PO Box 320, University of Papua New Guinea & College of Arts, Society & Education, James Cook University, Cairns, Queensland 4811, Australia.
³³ Ngadju elder, Coolgardie, Western Australia 6429, Australia.
³⁴ Wongatha elder, Kurrawang, Western Australia 6430, Australia.
³⁵ Department of Anatomy, University of Otago, Dunedin 9054, New Zealand.
³⁶ 2209 Springbrook Road, Springbrook, Queensland 4213, Australia.
³⁷ Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
³⁸ Estonian Biocentre, Riia 23b, Tartu 51010, Estonia.
³⁹ 86 Workshop Road, Yarrabah, Queensland 4871, Australia.
⁴⁰ Esperance Nyungar elder, Esperance, Western Australia 6450, Australia.
⁴¹ Atakani Street, Napranum, Queensland 4874, Australia.
⁴² 2 Wynnum North Road, Wynnum, Queensland 4178, Australia.
⁴³ School of Anthropology and Museum Ethnography, Oxford University, Oxford OX2 6PE, UK.
⁴⁴ Centre for Rock Art Research and Management, M257, University of Western Australia, Perth, Western Australia 6009, Australia.
⁴⁵ Leverhulme Centre for Human Evolutionary Studies, Department of Archaeology and Anthropology, University of Cambridge, Fitzwilliam Street, Cambridge CB2 1QH, UK.
⁴⁶ Department of Linguistics, Yale University, 370 Temple Street, New Haven, Connecticut 06520, USA.
⁴⁷ Institute for Human Genetics, University of California, San Francisco, California 94143, USA.
⁴⁸ Departments of Integrative Biology and Statistics, University of California, Berkeley, California 94720, USA.

PMID: 27654914
PMCID: PMC7617037
DOI: 10.1038/nature18299

Anna-Sapfo Malaspinas et al. Nature. 2016.

Abstract

The population history of Aboriginal Australians remains largely uncharacterized. Here we generate high-coverage genomes for 83 Aboriginal Australians (speakers of Pama-Nyungan languages) and 25 Papuans from the New Guinea Highlands. We find that Papuan and Aboriginal Australian ancestors diversified 25-40 thousand years ago (kya), suggesting pre-Holocene population structure in the ancient continent of Sahul (Australia, New Guinea and Tasmania). However, all of the studied Aboriginal Australians descend from a single founding population that differentiated ~10-32 kya. We infer a population expansion in northeast Australia during the Holocene epoch (past 10,000 years) associated with limited gene flow from this region to the rest of Australia, consistent with the spread of the Pama-Nyungan languages. We estimate that Aboriginal Australians and Papuans diverged from Eurasians 51-72 kya, following a single out-of-Africa dispersal, and subsequently admixed with archaic populations. Finally, we report evidence of selection in Aboriginal Australians potentially associated with living in the desert.

Figures

**Extended Data Figure 1. Per individual admixture proportions of K=7 ancestral components including Aboriginal Australians, New Guineans, Europeans, Africans, Melanesians and Polynesians.**
The genome of each individual is depicted as a bar and is coloured according to the estimated genome-wide proportions of ancestry components. An unrooted tree showing the relationships between the identified ancestral components is also estimated by our method. Each ancestry has been labelled with the name of the population (see also map) showing the highest fraction of that ancestral component. The cross validation (CV) error is minimised for this value of K for 5-fold CV (S05). The rooted tree supports the shared genetic origin of Aboriginal Australians, Papuans and Bougainvilleans.

**Extended Data Figure 2. Genetic relationships of Aboriginal Australians and Papuans.**
a, Genetic affinities between a Western Central Desert (WCD02) genome and Aboriginal Australians and Papuans. Outgroup f₃ statistics between WCD02 and all other Aboriginal Australians and Highland Papuan individuals that were whole-genome sequenced for this study, using all genotypes called from the sequencing data. Because the widespread recent admixture in Aboriginal Australians has large confounding effects on the f₃ statistics, the values were adjusted using the slope coefficient from a simple linear regression model fitted to the relationship between f₃ and the fraction of non-indigenous (*i.e.*, neither Aboriginal Australian nor Papuan) ancestry in each individual genome. The adjusted f₃ statistics display a genetic gradient that separates western and eastern Aboriginal Australian populations. However, we find no differences between Papuan population samples in their level of Aboriginal Australian affinity (Kruskal-Wallis test, p-value = 0.083). Horizontal lines correspond to ±1 standard error. b, Genetic affinities between a Papuan highlander genome and Aboriginal Australians and Papuans. The Papuan highlander sample MAR01 from the Marawaka area was arbitrarily chosen as a reference point for this analysis. f₃ values were adjusted for recent admixture as in (a). All Aboriginal Australian groups display a similar level of Highland Papuan affinity (with the exception of three outlier individuals from the north-eastern WPA and CAI populations: WPA06, WPA05 and CAI10, the latter two of which are known to have at least one parent with origins in Papua New Guinea or the Torres Strait Islands). While some differences between groups are statistically significant (Kruskal-Wallis test, p-value = 0.0002, after removing the three outliers), which could be consistent with, e.g., low levels of Papuan gene flow into some Aboriginal Australian groups (see S06 and S07), we caution that some of these differences are likely due to imperfect adjustment for Eurasian admixture (the adjusted f₃ is highest in the WCD population, which has the least Eurasian admixture). Horizontal lines correspond to ±1 standard error. c, MSMC analyses. Linear interpolation through the midpoints of the time intervals of the relative cross coalescence rate estimates from MSMC using pairs of individuals including one HGDP-Papuan and one other individual as indicated. We used CAI01, PIL06, WCD01, WON03 and an ECCAC sample for this analysis (see S08 for details). The MSMC results were scaled using a mutation rate of 1.25 × 10⁻⁸ per generation per site and a generation time of 29 years, corresponding to the average hunter-gatherer generation interval for males and females.
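The admixture adjustment described for panels a and b can be illustrated with a short sketch. It assumes that the adjustment subtracts the fitted regression slope times each individual's non-indigenous ancestry fraction, which is one plausible reading of the legend; the function names and toy numbers are hypothetical and are not taken from the study.

```python
def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def adjust_f3(f3_values, admixture_fractions):
    """Remove the linear effect of recent admixture from f3 values.

    Assumption: adjusted = observed - slope * fraction, i.e. each value is
    projected to the f3 expected at zero non-indigenous ancestry.
    """
    slope = ols_slope(admixture_fractions, f3_values)
    return [f - slope * a for f, a in zip(f3_values, admixture_fractions)]


# Toy data (hypothetical): f3 decreases linearly as admixture increases.
fractions = [0.0, 0.1, 0.2, 0.4]
observed = [0.30, 0.29, 0.28, 0.26]
adjusted = adjust_f3(observed, fractions)
```

With these toy values the adjusted statistics coincide, because all of the variation in the observed values is explained by the admixture fraction.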

**Extended Data Figure 3. Introgressed archaic sites and putative Denisovan and Neanderthal haplotypes.**
a, Distribution of per individual number of putative introgressed sites from archaic humans. The number of Neanderthal-specific introgressed sites increases from Europe to Australia, and then decreases in Amerindians, which is consistent with recurrent Neanderthal (or Neanderthal-related archaic) gene flow during the expansion into Eurasia. Our results are thus indicative of several pulses of Neanderthal gene flow into modern humans, as inferred previously. We note, however, that the apparent high levels of Neanderthal-specific introgressed sites in Australo-Papuans can be explained by the expected number of misclassified Neanderthal introgressed sites resulting from the shared ancestry with Denisovans (see S10 for details). b, c, d, e, Putative Denisovan (PDH) and Neanderthal haplotypes (PNH). The putative haplotypes correspond to clusters (four or more SNPs spanning at least 4 kb) of heterozygous or homozygous genotypes in complete linkage disequilibrium (“diplotypes”) that are potentially the result of Neanderthal or Denisovan admixture. Those diplotypes are homozygous ancestral in 10 Africans, homozygous derived in the Denisovan for the PDH (respectively Neanderthal for the PNH), homozygous ancestral in the Neanderthal for the PDH (respectively Denisovan for the PNH), and with the derived allele segregating in all other contemporary non-African humans (see S11 for details). We report the average number of the PDHs and PNHs (b), the correlation between the estimated amount of Australo-Papuan ancestry (see Figure 2b, Extended Data Figure 1, S05) and the number of identified PDHs for each Australian sample (c), the sum of the lengths (d) and the average length (e) of the PDHs and PNHs per individual for worldwide populations included in our reference panel (see S03).

**Extended Data Figure 4. Out of Africa: admixture graphs based on D-statistic and MSMC analyses.**
a, Admixture graphs representing some of the topologies considered for the two waves and one wave Out of Africa models assuming Denisovan admixture. All topologies are identical except for the coloured lineages representing Australo-Papuans (green), Neanderthal (Nea, orange) and Denisovan (Den, blue). The graphs differ in (i) the number of OoA events, and (ii) the number of Neanderthal admixture pulses. Png stands for HGDP-Papuan. b, Sum of Squared Errors between the observed D-statistics and the expectations for each quartet in the graph involving the chimpanzee as an outgroup for each of the admixture graphs shown in a and the corresponding four without Denisovan admixture. Each point is the result of the optimization procedure with different starting points. See S09 for details. c, MSMC analyses. Relative cross coalescence rate (CCR) estimates from MSMC for pairs of individuals including one African sample (Yoruba, Dinka and San) and one other sample from Eurasia, as indicated in the legend. d, Simulation study to assess the effect of archaic admixture on the CCR rates. Relative CCR estimated for data simulated under a simple two population divergence model where one of the populations admixed at different rates with an archaic population. See S08 for details.
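As a rough illustration of the quantities in panel b, the sketch below computes a D-statistic from derived-allele frequencies and the sum of squared errors between observed and expected values. It assumes the common ABBA-BABA formulation with the outgroup (here the chimpanzee) as the fourth population; sign conventions differ between tools, and the function names are hypothetical rather than the study's own code.

```python
def d_statistic(p1, p2, p3, p4):
    """D-statistic from per-SNP derived-allele frequencies.

    Convention assumed here: D > 0 when P2 shares more derived alleles
    with P3 than P1 does. p4 is the outgroup.
    """
    abba = 0.0
    baba = 0.0
    for a, b, c, o in zip(p1, p2, p3, p4):
        abba += (1 - a) * b * c * (1 - o)
        baba += a * (1 - b) * c * (1 - o)
    return (abba - baba) / (abba + baba)


def sum_squared_errors(observed, expected):
    """Sum of squared differences between observed and expected D values."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


# Toy frequencies (hypothetical): one ABBA site and one BABA site.
balanced = d_statistic([0, 1], [1, 0], [1, 1], [0, 0])
# Two ABBA sites and no BABA sites.
skewed = d_statistic([0, 0], [1, 1], [1, 1], [0, 0])
```

An admixture graph that fits well would give expected D values close to the observed ones for every quartet, and hence a small sum of squared errors.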

**Extended Data Figure 5. Inferred deleterious mutations.**
a, Boxplot of the number of derived homozygous sites per individual for worldwide populations that are predicted to be deleterious. Deleteriousness of SNPs was inferred using GERP Rejected Substitution (RS) scores. Derived alleles with an RS score larger than 2 were considered to be deleterious (see S11). Average RS score per individual calculated across heterozygous sites (b), and derived homozygous sites (c). Each coloured symbol corresponds to estimates from a single individual. Homozygosity is calculated as the number of derived homozygous sites divided by the number of sites at which an individual carries at least one copy of the derived allele. Solid lines show the linear regression of homozygosity against average RS score per individual for non-African modern humans. Dashed lines indicate the 95% confidence interval for the linear regression.
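The homozygosity measure defined in this legend is a simple ratio, sketched below. The genotype coding (0, 1 or 2 copies of the derived allele per site) and the function name are assumptions made for illustration.

```python
def derived_homozygosity(genotypes):
    """Derived homozygous sites divided by sites carrying a derived allele.

    genotypes: per-site counts of the derived allele (0, 1 or 2).
    """
    carrying = [g for g in genotypes if g >= 1]
    if not carrying:
        raise ValueError("no site carries the derived allele")
    homozygous = sum(1 for g in carrying if g == 2)
    return homozygous / len(carrying)


# Toy individual (hypothetical): four sites carry the derived allele,
# two of them in the homozygous state.
example = derived_homozygosity([0, 1, 2, 2, 1, 0])
```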

**Extended Data Figure 6. Effective population size changes over time.**
a, MSMC analyses. Population size estimates from MSMC for pairs of individuals from several populations within and outside of Australia. For each run, we used two individuals from each population, *i.e.*, four haplotypes in each run. MSMC results were scaled as in Figure 3. b, Bayesian Skyline Plots. Bayesian Skyline Plots (BSP) calculated from the mtDNA genome sequences, showing the effective population size estimates over time when considering either groups from northeastern Australia (CAI, WPA) or groups from southwestern Australia (ENY, NGA, WCD, WON). Solid lines are the estimates, dashed lines are the corresponding 95% credible intervals (see S12).
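The scaling mentioned for the MSMC results can be sketched as follows. This assumes the conventional MSMC output units, in which time is scaled by the per-site mutation rate and the coalescence rate λ relates to effective size as 1/(2μλ); the mutation rate and generation time are the values quoted in the legends, while the function names and example inputs are hypothetical.

```python
MUTATION_RATE = 1.25e-8   # per generation per site, as quoted in the legends
GENERATION_YEARS = 29     # generation time in years, as quoted in the legends


def scaled_time_to_years(scaled_time, mu=MUTATION_RATE, gen=GENERATION_YEARS):
    """Convert mutation-scaled time to years (assumed MSMC convention)."""
    return scaled_time / mu * gen


def coalescence_rate_to_ne(lam, mu=MUTATION_RATE):
    """Convert a scaled coalescence rate to effective population size."""
    return 1.0 / (2.0 * mu * lam)


# Hypothetical inputs, not values from the study.
years = scaled_time_to_years(1e-4)     # 8,000 generations
ne = coalescence_rate_to_ne(1000.0)
```

Under these assumptions a scaled time of 1e-4 corresponds to 8,000 generations, or about 232,000 years.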

**Extended Data Figure 7. Genetics mirrors geography and languages.**
a, b, Procrustes analyses of the first two dimensions of a classical multidimensional scaling (MDS) analysis of the Aboriginal Australian genome sequences (autosomes). We considered two cases: an analysis including all variants (a), or only the variants remaining after genomic regions of putative recent European and East Asian (i.e., Han Chinese) origin are “masked” (b, S06). Both MDS plots have been rotated towards the best overlap with geographic sampling locations as defined by Procrustes analysis. In each plot, the arrows indicate the error of the MDS coordinates towards the assigned population sampling geographic coordinates. We find that the genetic relationships within Australia mirror geography, with a significant correlation for both cases, i.e., r_GEN,GEO = 0.59, p-value < 0.0005 for all variants and even higher (r_GEN,GEO = 0.77, p-value < 0.0005) for the masked data. We find using the Bearing correlogram approach that the main axis of genetic differentiation in the masked Aboriginal Australian genomes is at angle = 65° compared to the equator, i.e., in the southwest to northeast direction (S13). c, d, Correspondence between genetics and linguistics. c, Unrooted Neighbor-Joining F_ST-based genetic tree (cladogram). Weir and Cockerham F_ST distance was computed between the Aboriginal Australian populations after masking the Eurasian tracts. Statistical robustness of each branch was estimated by means of a bootstrap analysis (1000 replicates, S05). d, Bayesian phylogenetic tree for the 28 different Pama-Nyungan languages represented in this sample (see S15). Posterior probabilities are also indicated. Note that one language group can be shared by different Aboriginal Australian groups. The linguistic tree was built with BEAST (50). e, f, g, Gene flow across the continent. e, Mantel non-parametric r (estimating the goodness of fit between genetic differentiation and connectivity) versus ratios of resistance of inland to coastal nodes, showing a peak at 1.7. f, Best fit of pairwise population genetic differentiation, F_ST (computed between the nine Aboriginal Australian groups after masking Eurasian tracts (S06)), versus pairwise connectivity based on the environment (estimated as resistance) when moving inland is 1.7 times harder than moving along coastal nodes. g, Gene flow across the Australian landscape, quantified as the cumulative current for pairwise connections among Aboriginal Australian groups (black circles), with larger current (warmer colours) representing greater gene flow.

**Extended Data Figure 8. European, East Asian and Papuan genomic tracts in Aboriginal Australians.**
a, Distribution of the tracts assigned to Aboriginal Australian (WCD), Papuan, East Asian or European ancestry for 58 unrelated non-WCD Australian samples. Most of the shorter tracts were of Papuan origin, suggesting that a large fraction of the Papuan gene flow is much older than that from Europe and East Asia, consistent with a Papuan influence spreading slowly from northeastern to southwestern Australia by ancient migration. b, Corresponding scatter plot with fitted line of per-individual variance in Papuan tract length vs. geographic distance from WCD, the latter calculated using the Great Circle Distance formula for pairs of individual GPS coordinates. Papuan tract distribution showed a strong and significant correlation with distance from WCD (r = 0.64; p-value < 10⁻⁵), with “younger tracts” closer to New Guinea (i.e., with a larger variance) and “older tracts” closer to WCD (i.e., with a smaller variance). This is also consistent with continuous Papuan gene flow spreading from the northeast.
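The great-circle distance between two GPS coordinates, as used in panel b, can be computed with the haversine formula. The sketch below is one standard form of that calculation, not necessarily the implementation used in the study; the mean Earth radius of 6,371 km and the function name are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the domain of asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Sanity checks on simple geometry rather than real sampling locations.
same_point = great_circle_km(-25.0, 130.0, -25.0, 130.0)
half_equator = great_circle_km(0.0, 0.0, 0.0, 180.0)
```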

**Figure 1. Aboriginal Australian and Papuan samples used in this study, as well as archaeological sites and human remains dated to ~40 kya or older in southern Sunda and Sahul.**
The stars indicate the centroid location for each sampling group (sample size in parentheses). Publicly available genetic data (see S04) used as a reference panel in this study are shown as squares. Sites with dated human remains are shown as white circles and the archaeological sites as black circles. The associated dates can be found in S03. Grey boundaries correspond to territories defined by the language groups provided by the Australian Institute of Aboriginal and Torres Strait Islander Studies. Sampled Aboriginal Australians self-identify primarily as: Yidindji and Gungandji from the Cairns region (CAI, 10, see also S02); Yupangati and Thanakwithi from northwest Cape York (WPA, 6); Wangkangurru and Yarluyandi from the Birdsville region (BDV, 10, 9 sequenced at high depth); Barkindji from southeast (RIV, 8); Pilbara area Yinhawangka and Banjima (PIL, 12); Ngaanyatjarra from western central desert (WCD, 13); Wongatha from WA’s northern Goldfields (WON, 11); Ngadju from WA’s southern Goldfields (NGA, 6); and Nyungar from southwest Australia (ENY, 8). Papuans include samples from the locations Bundi (BUN, 5), Kundiawa (KUN, 5), Mendi (MEN, 5), Marawaka (MAR, 5) and Tari (TAR, 5). We generated SNP array data (black stars) for 45 Papuan samples including 24 Koinambe (KOI) and 15 Kosipe (KOS) individuals (described before) and 6 individuals with Highland ancestry sampled in Port Moresby (PMO). Lake Carpentaria (LC), which covered a significant portion of the land bridge between Australia and New Guinea 11.5-40 kya and thus potentially acted as a barrier to gene flow, is also indicated. Map data were sourced from the Australian Government, http://www.naturalearthdata.com/ and our research.

**Figure 2. Genetic ancestry of Aboriginal Australians in a worldwide context.**
a, Classical multidimensional scaling (MDS) plot of the first two dimensions based on an identity-by-state (IBS) distance matrix (based on 54,971 SNPs) between individuals from this study and worldwide populations, including publicly available data. The first two dimensions explain 19% of the variance in the IBS distance matrix. Individuals are colour-coded according to sampling location, grouped into Australia (Arnhem Land, ECCAC, BDV, CAI, ENY, NGA, PIL, RIV, WCD, WON, WPA); East Asia (Cambodian, Dai, Han, Japanese, Naxi); Europe (English, French, Sardinian, Scottish, Spanish); India (Vishwabrahmin, Dravidian, Punjabi, Guaharati); and New Guinea (HGDP Papuan, Central Province, Eastern Highlands, Gulf Province, Highlands, PMO, KOI, KOS, BUN, KUN, MEN, TAR, MAR). Stars indicate the centroid for each Aboriginal Australian group. Aboriginal Australians from this study as well as from previous studies are closest to Papuans and also show signals of admixture with Eurasians (see S05 for details). b, Estimation of genomic ancestry proportions for the best number of ancestral components (K=7) based on Aboriginal Australian and Papuan whole genome sequence and SNP array data from this study (see Figure 1), and publicly available SNP array data (S05). Each ancestry component has been labelled according to the geographic region showing the corresponding highest frequency. The area of each pie chart is proportional to the sample size (as depicted in the legend). The genomes of Aboriginal Australian populations are mostly a mixture of European and Aboriginal Australian ancestry components. Northern Aboriginal Australian groups (Arnhem Land, CAI, ECCAC, PIL and WPA) are also assigned to components mainly present in East Asian populations, while northeastern Aboriginal Australian groups (CAI and WPA) also show components mainly present in New Guinean populations. A background of 5% “Melanesian” component is observed in all the Aboriginal Australian populations; however, this component is widely spread over the geographic area shown in this figure, being present from Taiwan to India. We detected on average 1.5% “Indian” component and 1.4% “Polynesian” component across the Aboriginal Australian samples, but we attribute these residual ancestry components to statistical noise as they are present in other southeast Asian populations and are not supported by other analyses (S05). c, A heat map displaying outgroup f₃ statistics of the form *f₃(Mbuti; WCD02, X)*, quantifying genetic drift shared between the putatively unadmixed individual WCD02, chosen to represent the Aboriginal Australian population, and various populations throughout the broader region for which either array genotypes or whole-genome sequencing data were publicly available or generated in this study. We used 760,116 SNPs for which WCD02 had non-missing array genotypes that overlapped with any other datasets. Standard errors as estimated from block jackknife resampling across the genome were in the range 0.00213-0.00713.
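The outgroup f₃ statistic in panel c and its block jackknife standard error can be illustrated with a simplified sketch. It assumes the basic estimator, the mean over SNPs of (o − a)(o − b) for outgroup, first and second population allele frequencies, without the finite-sample correction that dedicated software applies, and it uses equal-sized blocks of consecutive SNPs instead of blocks defined by genomic distance. All names and toy values are hypothetical.

```python
import math


def f3_outgroup(out, a, b):
    """Basic f3(Outgroup; A, B): mean of (o - a)(o - b) across SNPs."""
    return sum((o - x) * (o - y) for o, x, y in zip(out, a, b)) / len(out)


def block_jackknife_se(out, a, b, n_blocks):
    """Delete-one-block jackknife standard error of f3.

    SNPs are split into n_blocks equal blocks of consecutive sites;
    trailing SNPs that do not fill a block are ignored.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    size = len(out) // n_blocks
    if size == 0:
        raise ValueError("more blocks than SNPs")
    used = n_blocks * size
    estimates = []
    for k in range(n_blocks):
        lo, hi = k * size, (k + 1) * size
        keep = [i for i in range(used) if i < lo or i >= hi]
        estimates.append(f3_outgroup([out[i] for i in keep],
                                     [a[i] for i in keep],
                                     [b[i] for i in keep]))
    mean = sum(estimates) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return math.sqrt(variance)


# Toy allele frequencies (hypothetical), four SNPs.
out_freq = [0.0, 0.0, 0.0, 0.0]
a_freq = [0.2, 0.4, 0.2, 0.4]
b_freq = [0.5, 0.5, 0.5, 0.5]
f3_value = f3_outgroup(out_freq, a_freq, b_freq)
se_two_blocks = block_jackknife_se(out_freq, a_freq, b_freq, 2)
se_four_blocks = block_jackknife_se(out_freq, a_freq, b_freq, 4)
```

Larger f₃ values indicate more drift shared by the two non-outgroup populations since their split from the outgroup.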

**Figure 3. Settlement of Australia.**
Best supported demographic model of the colonisation of Australia and New Guinea. The demographic history of Aboriginal Australian populations was modelled by considering that sampled individuals are from sub-populations (“islands”) that are part of two larger regions (“continents”), which geographically match the northeast (ne) and the southwestern desert (swd) regions of Australia. Maximum likelihood parameter estimates (MLEs) were obtained from the joint site frequency spectrum (SFS) of Han Chinese, HGDP-Papuans, CAI, WPA, WON and WCD. The 95% CI, obtained by non-parametric block bootstrap, are shown within square brackets. Estimated migration rates scaled by the effective population size (2Nm) are shown above/below the corresponding arrows. Only Aboriginal Australian individuals with low European ancestry were included in this analysis. In this model, we estimated parameters specific to the settlement of Australia and New Guinea (numerical values shown in black), keeping all the other demographic parameters set to the point estimates shown in Figure 4 (numerical values shown in grey here). Only admixture events involving proportions >0.5% are shown. The inferred parameters were scaled using a mutation rate of 1.25 × 10⁻⁸ per generation per site and a generation time of 29 years, corresponding to the average hunter-gatherer generation interval for males and females. See S07 for further details.

**Figure 4. Out of Africa.**
We used a likelihood-based approach to investigate whether the joint SFS supports the one-wave (1 OoA) or two-waves (2 OoA) scenarios. The maximum likelihood estimates (MLEs) are indicative of which scenario is best supported. As shown on the top left inset, under the 1 OoA scenario we expect (i) the presence of an ancestral bottleneck (in black), (ii) a relatively large Neanderthal admixture pulse shared by the ancestors of all non-Africans, and (iii) overlapping divergence times of the ancestors of Aboriginal Australians and Eurasians. In contrast, the top right inset shows parameters expected under a 2 OoA scenario: (i) a limited/absent ancestral bottleneck (in black) in the ancestors of all non-Africans, (ii) no shared Neanderthal admixture in the ancestors of all non-Africans, and (iii) distinct divergence times for Aboriginal Australians and Eurasians. The main population tree shows the best fitting topology, which supports the 1 OoA scenario, and maximum likelihood estimates (MLEs) for the divergence and admixture times and the admixture proportions (with 95% CI obtained by non-parametric block bootstrap shown within square brackets). We assume that the OoA event is associated with the ancestral bottleneck. The “Ghost” population represents an unsampled population related to Yoruba that is the source of the out of Africa event(s). Our results suggest that these two African populations split significantly earlier (~125 kya) than the estimated time of dispersals into Eurasia. Note that under a 1 OoA scenario, this “Ghost” population becomes, after the ancestral bottleneck, the ancestral population of all non-Africans that admixed with Neanderthals. Note also that arrow thicknesses are proportional to the intensity of gene flow and the admixture proportions, and that only admixture events involving proportions >0.5% are displayed. The inferred parameters were scaled as for Figure 3. See S07 for further details.

Comment in

Geneticists attempt to heal rifts with Aboriginal communities.
Callaway E. Nature. 2016 Sep 22;537(7621):457-8. doi: 10.1038/537457a. PMID: 27652541.
Population genetics: A map of human wanderlust.
Tucci S, Akey JM. Nature. 2016 Oct 13;538(7624):179-180. doi: 10.1038/nature19472. Epub 2016 Sep 21. PMID: 27654916.
Consent: Data-sharing for indigenous peoples.
Kowal E, Llamas B, Tishkoff S. Nature. 2017 Jun 21;546(7659):474. doi: 10.1038/546474a. PMID: 28640268.
Data sharing: do scientists know best?
Lambert D, Millar C, Willerslev E. Nature. 2017 Aug 16;548(7667):281. doi: 10.1038/548281b. PMID: 28816250.
