Mol Biol Evol. 2023 Mar 4;40(3):msad008.
doi: 10.1093/molbev/msad008.

Estimating Temporally Variable Selection Intensity from Ancient DNA Data

Zhangyi He et al. Mol Biol Evol. 2023.

Abstract

Novel technologies for recovering DNA information from archaeological and historical specimens have made available an ever-increasing amount of temporally spaced genetic samples from natural populations. These genetic time series permit the direct assessment of patterns of temporal changes in allele frequencies and hold the promise of improving power for the inference of selection. Increased time resolution can further facilitate testing hypotheses regarding the drivers of past selection events such as the incidence of plant and animal domestication. However, studying past selection processes through ancient DNA (aDNA) still involves considerable obstacles such as postmortem damage, high fragmentation, low coverage, and small samples. To circumvent these challenges, we introduce a novel Bayesian framework for the inference of temporally variable selection based on genotype likelihoods instead of allele frequencies, thereby enabling us to model sample uncertainties resulting from the damage and fragmentation of aDNA molecules. Also, our approach permits the reconstruction of the underlying allele frequency trajectories of the population through time, which allows for a better understanding of the drivers of selection. We evaluate its performance through extensive simulations and demonstrate its utility with an application to the ancient horse samples genotyped at the loci for coat coloration. Our results reveal that incorporating sample uncertainties can further improve the inference of selection.

Keywords: ancient DNA; demographic history; natural selection; particle marginal Metropolis-Hastings; sampling uncertainty; two-layer hidden Markov model.
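The abstract's central idea is to work with genotype likelihoods rather than called genotypes or allele frequencies. As a minimal sketch of what a genotype likelihood is, the following assumes a biallelic site, a diploid individual, independent reads, and a symmetric per-read error probability; this is a common textbook read model, not necessarily the one used by the authors, and the names `genotype_likelihoods` and `normalise` are hypothetical.

```python
from math import prod


def genotype_likelihoods(reads):
    """Return [L(g=0), L(g=1), L(g=2)] for one individual at one site.

    reads: list of (is_mutant, error_prob) pairs, one per read.
    g counts copies of the mutant allele. Assumed read model (illustrative):
    a read samples one of the two alleles uniformly, then is misread
    with probability error_prob.
    """
    likelihoods = []
    for g in (0, 1, 2):
        p_mut_allele = g / 2  # chance the read originates from a mutant copy
        per_read = []
        for is_mutant, err in reads:
            p_obs_mut = p_mut_allele * (1 - err) + (1 - p_mut_allele) * err
            per_read.append(p_obs_mut if is_mutant else 1 - p_obs_mut)
        likelihoods.append(prod(per_read))  # prod([]) == 1 when there are no reads
    return likelihoods


def normalise(values):
    """Rescale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]
```

With no reads at all, the three likelihoods are equal, so a sample with zero coverage carries no information about the genotype instead of being forced into a call. Under this sketch, that is how low coverage and small samples are represented as uncertainty rather than discarded.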

Figures

Fig. 1.
Graphical representation of our two-layer HMM framework for the data on aDNA sequences, where X denotes the population mutant allele frequency, G denotes the sample individual genotypes, and R denotes the sample individual reads.
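The X, G, R layering in Fig. 1 can be read as a generative process: the population frequency X evolves through time, genotypes G are drawn from X, and reads R are drawn from G. The sketch below simulates that chain forward under several assumptions that are not taken from the paper: a discrete Wright-Fisher model with genic selection (mutant allele fitness 1 + s), Hardy-Weinberg genotype sampling, a single switch in the selection coefficient at `change_gen`, and a symmetric read error rate. All function and parameter names are hypothetical.

```python
import random


def wf_step(x, s, pop_size, rng):
    """One Wright-Fisher generation with genic selection (assumed model)."""
    expected = x * (1 + s) / (1 + s * x)  # frequency after selection
    copies = sum(rng.random() < expected for _ in range(2 * pop_size))
    return copies / (2 * pop_size)


def sample_individual(x, depth, err, rng):
    """Draw a genotype G from X, then a mutant-read count R from G."""
    g = (rng.random() < x) + (rng.random() < x)  # 0, 1 or 2 mutant copies
    mutant_reads = 0
    for _ in range(depth):
        is_mutant = rng.random() < g / 2  # read picks one of the two alleles
        if rng.random() < err:            # sequencing error or damage flips it
            is_mutant = not is_mutant
        mutant_reads += is_mutant
    return g, mutant_reads


def simulate(x0, s_minus, s_plus, change_gen, n_gens, pop_size,
             sample_gens, n_ind, depth, err, seed=0):
    """Return (trajectory of X, {generation: [(G, R), ...]})."""
    rng = random.Random(seed)
    x, trajectory, samples = x0, [x0], {}
    for gen in range(n_gens + 1):
        if gen in sample_gens:
            samples[gen] = [sample_individual(x, depth, err, rng)
                            for _ in range(n_ind)]
        if gen < n_gens:
            # selection coefficient switches from s_minus to s_plus
            s = s_minus if gen < change_gen else s_plus
            x = wf_step(x, s, pop_size, rng)
            trajectory.append(x)
    return trajectory, samples
```

Inference runs in the opposite direction: only the reads are observed, and both the genotypes and the frequency trajectory are hidden, which is why two hidden layers are needed.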
Fig. 2.
Posteriors for the selection coefficients and the underlying frequency trajectory of the mutant allele in the population produced through our method from a data set of genotype likelihoods generated with the selection coefficients s⁻ = 0.0198 and s⁺ = 0.0081. (a) Joint posterior for the selection coefficients s⁻ and s⁺. (b) Marginal posteriors for the selection coefficients s⁻ and s⁺. (c) Posterior for the underlying trajectory of the mutant allele frequency. (d) Posterior for the selection change Δs.
Fig. 3.
Empirical distributions for the bias in MAP estimates of the selection coefficients across different data qualities and selection scenarios. Data qualities (scenarios A–F) are described in table 1, and selection scenarios (scenarios 1–9) are described in table 2. (a)–(i) Boxplots of the bias for scenarios 1–9.
Fig. 4.
Empirical distributions for the bias in MAP estimates of the selection coefficient across different ranges of the selection coefficient s with the parameters ϕ=0.85 and ψ=1 (i.e., scenario D in table 1).
Fig. 5.
ROC curves for testing a change in selection across different data qualities and selection scenarios. The AUC value for each curve is summarized. Data qualities (scenarios A–F) are described in table 1, and selection scenarios (scenarios 1–9) are described in table 2. ROC curves for (a) a negative change in selection and (b) a positive change in selection.
Fig. 6.
ROC curves for testing a change in selection across different ranges of the selection change Δs with the parameters ϕ=0.85 and ψ=1 (i.e., scenario D in table 1). The AUC value for each curve is summarized. ROC curves for (a) a negative change in selection and (b) a positive change in selection.
Fig. 7.
Posteriors for the selection coefficients of the ASIP mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the ASIP mutation in the population. The samples drawn before 12,500 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients s⁻ and s⁺. (b) Marginal posteriors for the selection coefficients s⁻ and s⁺. (c) Posterior for the underlying frequency trajectory of the ASIP mutation. (d) Posterior for the selection change Δs.
Fig. 8.
Posteriors for the selection coefficients of the MC1R mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the MC1R mutation in the population. The samples drawn before 4,300 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients s⁻ and s⁺. (b) Marginal posteriors for the selection coefficients s⁻ and s⁺. (c) Posterior for the underlying frequency trajectory of the MC1R mutation. (d) Posterior for the selection change Δs.
Fig. 9.
Posteriors for the selection coefficients of the KIT13 mutation before and from the Middle Ages (starting from AD 400) and the underlying frequency trajectory of the KIT13 mutation in the population. The samples drawn before 3,645 BC are excluded. EMA stands for Early Middle Ages. (a) Joint posterior for the selection coefficients s⁻ and s⁺. (b) Marginal posteriors for the selection coefficients s⁻ and s⁺. (c) Posterior for the underlying frequency trajectory of the KIT13 mutation. (d) Posterior for the selection change Δs.
Fig. 10.
Posteriors for the selection coefficients of the TRPM1 mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the TRPM1 mutation in the population. The samples drawn before 14,500 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients s⁻ and s⁺. (b) Marginal posteriors for the selection coefficients s⁻ and s⁺. (c) Posterior for the underlying frequency trajectory of the TRPM1 mutation. (d) Posterior for the selection change Δs.
