Exact Decoding of a Sequentially Markov Coalescent Model in Genetics

Caleb Ki¹, Jonathan Terhorst¹

PMID: 39323740
PMCID: PMC11421421
DOI: 10.1080/01621459.2023.2252570

Caleb Ki et al. J Am Stat Assoc. 2024;119(547):2242-2255. doi: 10.1080/01621459.2023.2252570. Epub 2023 Oct 3.

Affiliation

¹ Department of Statistics, University of Michigan.

Abstract

In statistical genetics, the sequentially Markov coalescent (SMC) is an important family of models for approximating the distribution of genetic variation data under complex evolutionary models. Methods based on SMC are widely used in genetics and evolutionary biology, with significant applications to genotype phasing and imputation, recombination rate estimation, and inferring population history. SMC allows for likelihood-based inference using hidden Markov models (HMMs), where the latent variable represents a genealogy. Because genealogies are continuous, while HMMs are discrete, SMC requires discretizing the space of trees in a way that is awkward and creates bias. In this work, we propose a method that circumvents this requirement, enabling SMC-based inference to be performed in the natural setting of a continuous state space. We derive fast, exact procedures for frequentist and Bayesian inference using SMC. Compared to existing methods, ours requires minimal user intervention or parameter tuning, no numerical optimization or E-M, and is faster and more accurate.

Keywords: changepoint; coalescent; hidden Markov model; population genetics.

Figures

**Fig. 1**
Comparison of XSMC, PSMC, SMCSMC, and SMC++ on various simulated size histories.

**Fig. 2**
Result of fitting XSMC to 1000 Genomes data. For each superpopulation, 20 samples were chosen. Solid line denotes the median across all samples, and shaded bands denote the interquartile range.

Cited by

The solution surface of the Li-Stephens haplotype copying model.
Jin Y, Terhorst J. Algorithms Mol Biol. 2023 Aug 9;18(1):12. doi: 10.1186/s13015-023-00237-z. PMID: 37559098. Free PMC article.
Accelerated Bayesian inference of population size history from recombining sequence data.
Terhorst J. bioRxiv [Preprint]. 2024 Mar 27:2024.03.25.586640. doi: 10.1101/2024.03.25.586640. PMID: 38585997. Free PMC article.

Grants and funding

R35 GM151145/GM/NIGMS NIH HHS/United States
