Mol Ecol Resour. 2025 Jul;25(5):e13854.
doi: 10.1111/1755-0998.13854. Epub 2023 Aug 21.

Best practices for genotype imputation from low-coverage sequencing data in natural populations

Marina M Watowich et al. Mol Ecol Resour. 2025 Jul.

Abstract

Monitoring genetic diversity in wild populations is a central goal of ecological and evolutionary genetics and is critical for conservation biology. However, genetic studies of nonmodel organisms generally lack access to species-specific genotyping methods (e.g. array-based genotyping) and must instead use sequencing-based approaches. Although costs are decreasing, high-coverage whole-genome sequencing (WGS), which produces the highest confidence genotypes, remains expensive. More economical reduced representation sequencing approaches fail to capture much of the genome, which can hinder downstream inference. Low-coverage WGS combined with imputation using a high-confidence reference panel is a cost-effective alternative, but the accuracy of genotyping using low-coverage WGS and imputation in nonmodel populations is still largely uncharacterized. Here, we empirically tested the accuracy of low-coverage sequencing (0.1–10×) and imputation in two natural populations, one with a large (n = 741) reference panel, rhesus macaques (Macaca mulatta), and one with a smaller (n = 68) reference panel, gelada monkeys (Theropithecus gelada). Using samples sequenced to coverage as low as 0.5×, we could impute genotypes at >95% of the sites in the reference panel with high accuracy (median r² ≥ 0.92). We show that low-coverage imputed genotypes can be used to reliably estimate genetic relatedness and population structure. Based on these data, we also provide best practices and recommendations for researchers who wish to deploy this approach in other populations, with all code available on GitHub (https://github.com/mwatowich/LoCSI-for-non-model-species). Our results endorse accurate and effective genotype imputation from low-coverage sequencing, enabling the cost-effective generation of population-scale genetic datasets necessary for tackling many pressing challenges of wildlife conservation.
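
The abstract reports imputation accuracy as a median r2 but does not define the statistic. As a minimal sketch, assuming r2 is the squared Pearson correlation computed per site between high-coverage ("truth") genotypes coded 0/1/2 and imputed allele dosages across samples, the summary could be computed as below. The function names and toy data are hypothetical illustrations and are not taken from the authors' repository.

```python
from statistics import median


def squared_pearson(truth, imputed):
    """Squared Pearson correlation between two equal-length numeric sequences.

    Returns None when either sequence has zero variance (e.g. a site that is
    monomorphic in the truth set), since the correlation is undefined there.
    """
    n = len(truth)
    if n != len(imputed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(truth) / n
    mean_i = sum(imputed) / n
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_t == 0 or var_i == 0:
        return None
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(truth, imputed))
    return cov * cov / (var_t * var_i)


def median_site_r2(truth_by_site, imputed_by_site):
    """Median per-site r2, skipping sites where r2 is undefined."""
    values = []
    for truth, imputed in zip(truth_by_site, imputed_by_site):
        r2 = squared_pearson(truth, imputed)
        if r2 is not None:
            values.append(r2)
    return median(values) if values else None


if __name__ == "__main__":
    # Toy data (assumed): 3 sites x 4 samples, truth genotypes vs. dosages.
    truth = [[0, 1, 2, 1], [0, 0, 1, 2], [0, 0, 0, 0]]
    dosage = [[0.1, 0.9, 1.8, 1.2], [0.0, 0.3, 1.1, 1.7], [0.0, 0.1, 0.0, 0.2]]
    print(median_site_r2(truth, dosage))
```

Skipping undefined sites is a design choice made for this sketch; a real pipeline would also need to decide how to treat sites missing from either call set, which the abstract does not specify.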

Keywords: conservation; genotyping; imputation; next‐generation sequencing; population genetics.

Conflict of interest statement

The authors have no conflict of interest.
