Mol Biol Evol. 2023 Mar 4;40(3):msad008. doi: 10.1093/molbev/msad008.

Estimating Temporally Variable Selection Intensity from Ancient DNA Data

Zhangyi He¹ ², Xiaoyang Dai³, Wenyang Lyu⁴, Mark Beaumont⁵, Feng Yu⁴

Affiliations

¹ Cancer Research UK Beatson Institute, Glasgow, United Kingdom.
² Department of Computer Science, University of Oxford, Oxford, United Kingdom.
³ The Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom.
⁴ School of Mathematics, University of Bristol, Bristol, United Kingdom.
⁵ School of Biological Sciences, University of Bristol, Bristol, United Kingdom.

PMID: 36661852
PMCID: PMC10063216
DOI: 10.1093/molbev/msad008

Zhangyi He et al. Mol Biol Evol. 2023.

Abstract

Novel technologies for recovering DNA information from archaeological and historical specimens have made available an ever-increasing amount of temporally spaced genetic samples from natural populations. These genetic time series permit the direct assessment of patterns of temporal changes in allele frequencies and hold the promise of improving power for the inference of selection. Increased time resolution can further facilitate testing hypotheses regarding the drivers of past selection events such as the incidence of plant and animal domestication. However, studying past selection processes through ancient DNA (aDNA) still involves considerable obstacles such as postmortem damage, high fragmentation, low coverage, and small samples. To circumvent these challenges, we introduce a novel Bayesian framework for the inference of temporally variable selection based on genotype likelihoods instead of allele frequencies, thereby enabling us to model sample uncertainties resulting from the damage and fragmentation of aDNA molecules. Also, our approach permits the reconstruction of the underlying allele frequency trajectories of the population through time, which allows for a better understanding of the drivers of selection. We evaluate its performance through extensive simulations and demonstrate its utility with an application to the ancient horse samples genotyped at the loci for coat coloration. Our results reveal that incorporating sample uncertainties can further improve the inference of selection.

Keywords: ancient DNA; demographic history; natural selection; particle marginal Metropolis-Hastings; sampling uncertainty; two-layer hidden Markov model.
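The generative side of the model described in the abstract (population allele frequency, then individual genotypes, then sequencing reads) can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a diploid Wright-Fisher population of constant size, a dominance parameter `h`, Hardy-Weinberg genotype sampling, and a single symmetric per-read error rate `eps` standing in for damage and sequencing error. All function names and default values are hypothetical. The selection coefficients in the usage lines are the ones quoted in the caption of Fig. 2.

```python
import random

def wf_step(x, s, h, n_diploid, rng):
    """One Wright-Fisher generation: viability selection, then binomial drift."""
    # Mean fitness with genotype fitnesses 1+s (mutant homozygote), 1+h*s, 1.
    w_bar = x * x * (1 + s) + 2 * x * (1 - x) * (1 + h * s) + (1 - x) * (1 - x)
    x_sel = (x * x * (1 + s) + x * (1 - x) * (1 + h * s)) / w_bar
    # Draw 2N gametes from the post-selection frequency.
    copies = sum(rng.random() < x_sel for _ in range(2 * n_diploid))
    return copies / (2 * n_diploid)

def simulate_trajectory(x0, s_minus, s_plus, change_gen, n_gens, n_diploid, rng, h=0.5):
    """Mutant allele frequency path with selection s_minus before change_gen, s_plus from it on."""
    traj = [x0]
    for k in range(n_gens):
        s = s_minus if k < change_gen else s_plus
        traj.append(wf_step(traj[-1], s, h, n_diploid, rng))
    return traj

def sample_reads(x, n_reads, eps, rng):
    """Sample one individual's genotype (mutant copy count) and its reads."""
    g = sum(rng.random() < x for _ in range(2))  # Hardy-Weinberg sampling
    p_mut = (g / 2) * (1 - eps) + (1 - g / 2) * eps
    reads = [rng.random() < p_mut for _ in range(n_reads)]  # True = mutant read
    return g, reads

def genotype_likelihoods(reads, eps):
    """P(reads | genotype g) for g = 0, 1, 2 mutant copies."""
    liks = []
    for g in (0, 1, 2):
        p_mut = (g / 2) * (1 - eps) + (1 - g / 2) * eps
        lik = 1.0
        for r in reads:
            lik *= p_mut if r else 1 - p_mut
        liks.append(lik)
    return liks

# Usage: selection changes sign halfway through a 200-generation path.
rng = random.Random(1)
path = simulate_trajectory(0.2, 0.0198, -0.0081, 100, 200, 500, rng)
g_true, reads = sample_reads(path[-1], 2, 0.01, rng)  # two reads = low coverage
gl = genotype_likelihoods(reads, 0.01)
```

With no reads at all, the three likelihoods are equal and the individual carries no information about its genotype. With only a few reads they stay spread across genotypes. Working with these likelihoods rather than called genotypes is how sample uncertainty enters the inference.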

Figures

**Fig. 1.**
Graphical representation of our two-layer HMM framework for the data on aDNA sequences, where $X$ denotes the population mutant allele frequency, $G$ denotes the sample individual genotypes, and $R$ denotes the sample individual reads.

**Fig. 2.**
Posteriors for the selection coefficients and the underlying frequency trajectory of the mutant allele in the population produced through our method from a data set of genotype likelihoods generated with the selection coefficients $s^{-} = 0.0198$ and $s^{+} = -0.0081$. (a) Joint posterior for the selection coefficients $s^{-}$ and $s^{+}$. (b) Marginal posteriors for the selection coefficients $s^{-}$ and $s^{+}$. (c) Posterior for the underlying trajectory of the mutant allele frequency. (d) Posterior for the selection change $\Delta s$.

**Fig. 3.**
Empirical distributions for the bias in MAP estimates of the selection coefficients across different data qualities and selection scenarios. Data qualities (scenarios A–F) are described in table 1, and selection scenarios (scenarios 1–9) are described in table 2. (a)–(i) Boxplots of the bias for scenarios 1–9.

**Fig. 4.**
Empirical distributions for the bias in MAP estimates of the selection coefficient across different ranges of the selection coefficient $s$ with the parameters $\phi = 0.85$ and $\psi = 1$ (i.e., scenario D in table 1).

**Fig. 5.**
ROC curves for testing a change in selection across different data qualities and selection scenarios. The AUC value for each curve is summarized. Data qualities (scenarios A–F) are described in table 1, and selection scenarios (scenarios 1–9) are described in table 2. ROC curves for (a) a negative change in selection and (b) a positive change in selection.

**Fig. 6.**
ROC curves for testing a change in selection across different ranges of the selection change $\Delta s$ with the parameters $\phi = 0.85$ and $\psi = 1$ (i.e., scenario D in table 1). The AUC value for each curve is summarized. ROC curves for (a) a negative change in selection and (b) a positive change in selection.

**Fig. 7.**
Posteriors for the selection coefficients of the *ASIP* mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the *ASIP* mutation in the population. The samples drawn before 12,500 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients $s^{-}$ and $s^{+}$. (b) Marginal posteriors for the selection coefficients $s^{-}$ and $s^{+}$. (c) Posterior for the underlying frequency trajectory of the *ASIP* mutation. (d) Posterior for the selection change $\Delta s$.

**Fig. 8.**
Posteriors for the selection coefficients of the *MC1R* mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the *MC1R* mutation in the population. The samples drawn before 4,300 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients $s^{-}$ and $s^{+}$. (b) Marginal posteriors for the selection coefficients $s^{-}$ and $s^{+}$. (c) Posterior for the underlying frequency trajectory of the *MC1R* mutation. (d) Posterior for the selection change $\Delta s$.

**Fig. 9.**
Posteriors for the selection coefficients of the *KIT13* mutation before and from the Middle Ages (starting from AD 400) and the underlying frequency trajectory of the *KIT13* mutation in the population. The samples drawn before 3645 BC are excluded. EMA stands for Early Middle Ages. (a) Joint posterior for the selection coefficients $s^{-}$ and $s^{+}$. (b) Marginal posteriors for the selection coefficients $s^{-}$ and $s^{+}$. (c) Posterior for the underlying frequency trajectory of the *KIT13* mutation. (d) Posterior for the selection change $\Delta s$.

**Fig. 10.**
Posteriors for the selection coefficients of the *TRPM1* mutation before and from horse domestication (starting from 3,500 BC) and the underlying frequency trajectory of the *TRPM1* mutation in the population. The samples drawn before 14,500 BC are excluded. DOM stands for domestication. (a) Joint posterior for the selection coefficients $s^{-}$ and $s^{+}$. (b) Marginal posteriors for the selection coefficients $s^{-}$ and $s^{+}$. (c) Posterior for the underlying frequency trajectory of the *TRPM1* mutation. (d) Posterior for the selection change $\Delta s$.
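
The AUC values summarized in Figs. 5 and 6 have a simple rank interpretation: the AUC is the probability that a randomly chosen replicate with a true change in selection receives a higher score than a randomly chosen replicate without one, with ties counted as one half. The sketch below computes it that way. It assumes each simulated replicate has already been reduced to a single numeric score. The captions do not say which statistic is used, so the score and the function name `auc` are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores.

    scores_pos: scores of replicates simulated with a change in selection.
    scores_neg: scores of replicates simulated without a change.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 1 means the score separates the two groups perfectly, and 0.5 means it does no better than chance.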
