Fast simulation of identity-by-descent segments
- PMID: 40410602
- PMCID: PMC12102126
- DOI: 10.1007/s11538-025-01464-8
Abstract
The worst-case runtime complexity to simulate haplotype segments identical by descent (IBD) is quadratic in sample size. We propose two main techniques to reduce the compute time, both of which are motivated by coalescent and recombination processes. We provide mathematical results that explain why our algorithm should outperform a naive implementation with high probability. In our experiments, we observe average compute times to simulate detectable IBD segments around a locus that scale approximately linearly in sample size and take a couple of seconds for sample sizes that are less than 10,000 diploid individuals. In contrast, we find that existing methods to simulate IBD segments take minutes to hours for sample sizes exceeding a few thousand diploid individuals. When using IBD segments to study recent positive selection around a locus, our efficient simulation algorithm makes feasible statistical inferences, e.g., parametric bootstrapping in analyses of large biobanks, that would be otherwise intractable.
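The quadratic worst case mentioned above can be made concrete by counting haplotype pairs. The following is a minimal sketch for illustration only, not the authors' algorithm: it assumes that a naive approach would examine every unordered pair of sampled haplotypes, and that each diploid individual contributes two haplotypes. The function name `haplotype_pairs` is hypothetical.

```python
def haplotype_pairs(n_diploid: int) -> int:
    """Count unordered haplotype pairs among n_diploid diploid individuals.

    Assumption for illustration: each individual carries two haplotypes,
    and a naive approach inspects every pair once.
    """
    n_hap = 2 * n_diploid
    return n_hap * (n_hap - 1) // 2


if __name__ == "__main__":
    # A tenfold increase in sample size gives roughly a hundredfold
    # increase in the number of pairs.
    for n in (1_000, 10_000, 100_000):
        print(f"{n} diploid individuals: {haplotype_pairs(n)} haplotype pairs")
```

Under these assumptions, 10,000 diploid individuals already give about 2 × 10^8 pairs, which is why an approach whose average cost grows approximately linearly in sample size matters at biobank scale.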
Keywords: Coalescent; Computational runtime; Identity-by-descent.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Competing interests: The authors declare no competing interests. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable.
Update of
- Fast simulation of identity-by-descent segments. bioRxiv [Preprint]. 2025 Jan 7:2024.12.13.628449. doi: 10.1101/2024.12.13.628449. Update in: Bull Math Biol. 2025 May 23;87(7):84. doi: 10.1007/s11538-025-01464-8. PMID: 39829821.