Mol Biol Evol. 2017 Feb 1;34(2):296-317.
doi: 10.1093/molbev/msw216.

Signatures of Archaic Adaptive Introgression in Present-Day Human Populations

Fernando Racimo et al. Mol Biol Evol.

Abstract

Comparisons of DNA from archaic and modern humans show that these groups interbred, and in some cases received an evolutionary advantage from doing so. This process, adaptive introgression, may lead to a faster rate of adaptation than is predicted from models with mutation and selection alone. Within the last couple of years, a series of studies have identified regions of the genome that are likely examples of adaptive introgression. In many cases, once a region was ascertained as being introgressed, commonly used statistics based on both haplotype and allele frequency information were employed to test for positive selection. Introgression by itself, however, changes both the haplotype structure and the distribution of allele frequencies, thus confounding traditional tests for detecting positive selection. Therefore, patterns generated by introgression alone may lead to false inferences of positive selection. Here we explore models involving both introgression and positive selection to investigate the behavior of various statistics under adaptive introgression. In particular, we find that the number and allelic frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are particularly useful for detecting adaptive introgression. We then examine the 1000 Genomes dataset to characterize the landscape of uniquely shared archaic alleles in human populations. Finally, we identify regions that were likely subject to adaptive introgression and discuss some of the most promising candidate genes located in these regions.

Keywords: adaptive introgression; ancient DNA; denisova; neanderthal.

Figures

Fig. 1.
Schematic illustration of the way the UA,B,C and Q95A,B,C statistics are calculated.
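As a rough illustration of how statistics of this kind can be computed, the sketch below assumes the reading suggested by the captions in this list: U counts sites whose allele frequency is below w in the outgroup panel A, above x in the target panel B, and equal to y in the archaic panel C, while Q95 is the 95% quantile of panel-B frequencies over the sites that meet the A and C conditions. The function names, the list-based inputs, and the linear-interpolation quantile are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def u_stat(freq_a: Sequence[float], freq_b: Sequence[float],
           freq_c: Sequence[float], w: float, x: float, y: float) -> int:
    """Count sites with frequency < w in A, > x in B, and exactly y in C."""
    return sum(
        1
        for a, b, c in zip(freq_a, freq_b, freq_c)
        if a < w and b > x and c == y
    )


def q95_stat(freq_a: Sequence[float], freq_b: Sequence[float],
             freq_c: Sequence[float], w: float, y: float,
             q: float = 0.95) -> float:
    """Quantile q of B frequencies over sites with frequency < w in A and y in C.

    Uses linear interpolation between order statistics (an assumption).
    Returns 0.0 when no site passes the filter.
    """
    shared = sorted(
        b for a, b, c in zip(freq_a, freq_b, freq_c) if a < w and c == y
    )
    if not shared:
        return 0.0
    pos = q * (len(shared) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(shared) - 1)
    return shared[lo] + (pos - lo) * (shared[hi] - shared[lo])


# Toy per-site frequencies (hypothetical values, one entry per site).
freq_a = [0.0, 0.0, 0.3, 0.005, 0.0]   # outgroup panel A
freq_b = [0.25, 0.6, 0.9, 0.1, 0.0]    # target panel B
freq_c = [1.0, 1.0, 1.0, 1.0, 0.0]     # archaic panel C

u_20 = u_stat(freq_a, freq_b, freq_c, w=0.01, x=0.20, y=1.0)
u_50 = u_stat(freq_a, freq_b, freq_c, w=0.01, x=0.50, y=1.0)
q95 = q95_stat(freq_a, freq_b, freq_c, w=0.01, y=1.0)
```

The exact comparison `c == y` is workable here only because an archaic panel of two chromosomes can take the frequencies 0, 0.5, and 1; a larger panel would probably call for a tolerance.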
Fig. 2.
Demographic models described in the main text.
Fig. 3.
Receiver operating characteristic curves for a scenario of adaptive introgression (s = 0.1) compared with a scenario of neutrality (s = 0), using 1,000 simulations for each case. Populations A and B split from each other 4,000 generations ago, and their ancestral population split from population C 16,000 generations ago. Population sizes were constant and set at 2N=20,000. The admixture event occurred 1,600 generations ago from population C to population B, at rate 2% (top panels) or 25% (bottom panels). The right panels are zoomed-in versions of the left panels.
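A receiver operating characteristic curve of this kind can be built from two sets of simulated statistic values, one per scenario. The sketch below is a generic construction, assuming that larger values of the statistic indicate adaptive introgression; the function names and toy values are hypothetical and do not come from the paper's simulations.

```python
from typing import List, Sequence, Tuple


def roc_points(neutral: Sequence[float],
               adaptive: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    A simulation is called positive when its statistic is >= the threshold.
    Thresholds sweep from the largest observed value to the smallest.
    """
    thresholds = sorted(set(neutral) | set(adaptive), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(v >= t for v in neutral) / len(neutral)
        tpr = sum(v >= t for v in adaptive) / len(adaptive)
        points.append((fpr, tpr))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy statistic values standing in for simulations under s = 0 and s = 0.1.
neutral_values = [0, 1, 1, 2]
adaptive_values = [2, 3, 4, 5]

curve = roc_points(neutral_values, adaptive_values)
area = auc(curve)
```

Restricting the plotted range to small false positive rates gives a zoomed-in view like the one the caption describes for the right panels.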
Fig. 4.
Joint distribution of Q95A,B,C(w,y) and UA,B,C(w,x,y) for different choices of w (1%, 10%) and x (20%, 50%). We set y to 100% in all cases. 100 individuals were sampled from panel A, 100 from panel B, and 2 from panel C. The demographic parameters were the same as in figure 3.
Fig. 5.
We computed the number of uniquely shared sites in the autosomes and the X chromosome between particular archaic humans and different choices of present-day non-African panels X (x-axis) from phase 3 of the 1000 Genomes Project. We used a shared frequency cutoff of 0% (A), 20% (B), and 50% (C). Nea-only = UAfr,X,Nea,Den(1%,20%,100%,0%). Den-only = UAfr,X,Nea,Den(1%,20%,0%,100%). Nea-all = UAfr,X,Nea(1%,20%,100%). Den-all = UAfr,X,Den(1%,20%,100%). Both = UAfr,X,Nea,Den(1%,20%,100%,100%). Finally, we scaled each of the statistics from panels A to C by the number of segregating sites in each 1000 Genomes population panel, yielding panels D–F.
Fig. 6.
For each population panel from the 1000 Genomes Project, we jointly plotted the U and Q95 statistics with an archaic frequency cutoff of >20% within each population. Nea-only = UAfr,X,Nea,Den(1%,20%,100%,0%) and Q95Afr,X,Nea,Den(1%,100%,0%). Den-only = UAfr,X,Nea,Den(1%,20%,0%,100%) and Q95Afr,X,Nea,Den(1%,0%,100%). Nea-all = UAfr,X,Nea(1%,20%,100%) and Q95Afr,X,Nea(1%,100%). Den-all = UAfr,X,Den(1%,20%,100%) and Q95Afr,X,Den(1%,100%). Both = UAfr,X,Nea,Den(1%,20%,100%,100%) and Q95Afr,X,Nea,Den(1%,100%,100%).
Fig. 7.
We partitioned the genome into non-overlapping windows of 40 kb. Within each window, we computed UOut,EUR,Nea,Den(1%,x,y,z) and UOut,EAS,Nea,Den(1%,x,y,z), where Out = EAS + AFR for EUR as the target introgressed population, and Out = EUR + AFR for EAS as the target introgressed population. We searched for Neanderthal-specific alleles (y=100%,z=0%), Denisovan-specific alleles (y=0%,z=100%) and alleles present in both archaic genomes (y=100%,z=100%) that were uniquely shared with either EUR or EAS at frequencies above different cutoffs (x = 0%, x = 20%, x = 50%, and x = 80%). Windows that fall within the upper tail of the distribution for each modern-archaic population pair are colored in red (P < 0.001/number of pairs tested) and those that do not are colored in blue, except for those in the X chromosome, which are in green. Ovals drawn around multiple points contain multiple windows with uniquely shared alleles that are contiguous. For comparison, the number of high frequency uniquely shared sites between Denisova and Tibetans is also shown (Huerta-Sánchez et al. 2014), although Tibetans are not included in the 1000 Genomes data and the region is 32 kb long, so this may be an underestimate.
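The windowing step can be sketched as follows. The code assumes 1-based coordinates with windows starting at positions 1, 40001, 80001, and so on, which matches the candidate region coordinates given under Fig. 9 (for example chr9:16720001–16760000); the function names and the toy positions are hypothetical. The threshold helper simply divides 0.001 by the number of pairs tested, as stated in the caption.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_SIZE = 40_000  # 40 kb, as in the caption


def window_bounds(pos: int, size: int = WINDOW_SIZE) -> Tuple[int, int]:
    """Return the 1-based inclusive (start, end) of the window containing pos."""
    if pos < 1:
        raise ValueError("positions are assumed to be 1-based")
    index = (pos - 1) // size
    return index * size + 1, (index + 1) * size


def count_per_window(positions: Iterable[int],
                     size: int = WINDOW_SIZE) -> Dict[Tuple[int, int], int]:
    """Count uniquely shared sites per non-overlapping window."""
    return dict(Counter(window_bounds(p, size) for p in positions))


def corrected_threshold(n_pairs: int, alpha: float = 0.001) -> float:
    """P-value cutoff of alpha divided by the number of pairs tested."""
    if n_pairs < 1:
        raise ValueError("n_pairs must be at least 1")
    return alpha / n_pairs


# Hypothetical positions of uniquely shared sites on one chromosome.
sites = [16_720_001, 16_730_500, 16_760_000, 16_760_001]
counts = count_per_window(sites)
```

Contiguous windows that each carry uniquely shared sites could then be merged into a single region, which is how a candidate longer than 40 kb would arise under this scheme.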
Fig. 8.
We plotted the 40 kb regions in the 99.9% highest quantiles of both the Q95Out,Pop,Nea,Den(1%,y,z) and UOut,Pop,Nea,Den(1%,x,y,z) statistics for different choices of target introgressed population (Pop) and outgroup non-introgressed population (Out), and different archaic allele frequency cutoffs within the target population (x). (A) We plotted the extreme regions for continental populations EUR (Out = EAS + AFR), EAS (Out = EUR + AFR), and Eurasians (EUA, Out = AFR), using a target population archaic allele frequency cutoff x of 20%. (B) We plotted the extreme regions from the same statistics as in panel A, but with a more stringent target population archaic allele frequency cutoff x of 50%. (C) We plotted the extreme regions for individual non-African populations within the 1000 Genomes data, using all African populations (excluding African-Americans) as the outgroup, and a cutoff x of 20%. (D) We plotted the extreme regions from the same statistics as in panel C, but with a more stringent target population archaic allele frequency cutoff x of 50%. Nea-only = UOut,Pop,Nea,Den(1%,x,100%,0%) and Q95Out,Pop,Nea,Den(1%,100%,0%). Den-only = UOut,Pop,Nea,Den(1%,x,0%,100%) and Q95Out,Pop,Nea,Den(1%,0%,100%). Both = UOut,Pop,Nea,Den(1%,x,100%,100%) and Q95Out,Pop,Nea,Den(1%,100%,100%).
Fig. 9.
We explored the haplotype structure of six candidate regions with strong evidence for adaptive introgression (AI). For each region, we applied a clustering algorithm to the haplotypes of particular human populations and then ordered the clusters by decreasing similarity to the archaic human genome with the larger number of uniquely shared sites (see the “Methods” section). We also plotted the number of differences to the archaic genome for each human haplotype and sorted the haplotypes by decreasing similarity. In the latter case, no clustering was performed, so the rows in the cumulative difference plots do not necessarily correspond to the rows in the adjacent haplotype structure plots. POU2F3: chr11:120120001–120200000. BNC2: chr9:16720001–16760000. LARS: chr5:145480001–145520000. FAP/IFIH1: chr2:163040001–163120000. OAS1: chr12:113360001–113400000. LIPA: chr10:90920001–90980000.
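The second ordering described in this caption, sorting haplotypes by their number of differences to the archaic genome without clustering, can be sketched as below. Representing each haplotype and the archaic genome as an equal-length string of 0/1 allele states is a simplifying assumption, and the function names and toy data are hypothetical.

```python
from typing import List, Sequence


def n_differences(haplotype: str, archaic: str) -> int:
    """Number of sites at which a haplotype differs from the archaic sequence."""
    if len(haplotype) != len(archaic):
        raise ValueError("sequences must cover the same sites")
    return sum(h != a for h, a in zip(haplotype, archaic))


def sort_by_similarity(haplotypes: Sequence[str], archaic: str) -> List[str]:
    """Order haplotypes from most to least similar to the archaic sequence."""
    return sorted(haplotypes, key=lambda h: n_differences(h, archaic))


# Toy data: four haplotypes over four sites (hypothetical).
archaic_seq = "1101"
haps = ["0000", "1101", "1001", "0111"]

ordered = sort_by_similarity(haps, archaic_seq)
diffs = [n_differences(h, archaic_seq) for h in ordered]
```

Because Python's sort is stable, haplotypes with the same number of differences keep their input order.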
