Genome Res. 2025 Dec 3;35(12):2661-2670.
doi: 10.1101/gr.280567.125.

Pangenome-based genome inference using integer programming

Ghanshyam Chandra et al. Genome Res. 2025.

Abstract

Affordable genotyping methods are essential in genomics. Commonly used genotyping methods primarily support single-nucleotide variants and short indels but neglect structural variants. Additionally, accuracy of read alignments to a reference genome is unreliable in highly polymorphic and repetitive regions, further impacting genotyping performance. Recent works highlight the advantage of pangenome graphs in addressing these challenges. Building on these developments, we propose a rigorous alignment-free genotyping method. Our optimization framework identifies a path through the pangenome graph that maximizes the matches between the path and substrings of sequencing reads (e.g., k-mers) while minimizing recombination events (haplotype switches) along the path. We prove that this problem is NP-hard and develop efficient integer-programming solutions. We benchmark the algorithm using downsampled short-read data sets from homozygous human cell lines with coverage ranging from 0.1× to 10×. Our algorithm accurately estimates complete major histocompatibility complex (MHC) haplotype sequences with small edit distances from the ground-truth sequences, providing a significant advantage over existing methods on low-coverage inputs.
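
The trade-off described in the abstract, rewarding k-mer matches against the reads while penalizing haplotype switches, can be illustrated with a toy sketch. This is not the authors' formulation: the abstract states that their problem is NP-hard and solved with integer programming, whereas the sketch below assumes a much simpler setting in which every haplotype is pre-split into the same number of aligned blocks and each block is scored independently, so plain dynamic programming suffices. The names `kmers`, `infer_path`, and `switch_penalty`, the block layout, and the scoring rule are all hypothetical choices made for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def infer_path(haplotypes, reads, k=3, switch_penalty=2):
    """Choose one haplotype per block (toy model, not the paper's ILP).

    haplotypes: list of haplotypes, each a list of block sequences;
                all haplotypes must have the same number of blocks.
    Returns (inferred sequence, chosen haplotype index per block).
    """
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)

    n_haps = len(haplotypes)
    n_blocks = len(haplotypes[0])
    # hits[b][h]: distinct k-mers of block b on haplotype h seen in the reads.
    # Assumption: k-mers spanning block boundaries are ignored.
    hits = [[len(kmers(hap[b], k) & read_kmers) for hap in haplotypes]
            for b in range(n_blocks)]

    best = [hits[0][:]]          # best[b][h]: best score ending on h at block b
    back = [[None] * n_haps]     # back[b][h]: haplotype used at block b - 1
    for b in range(1, n_blocks):
        prev = best[-1]
        top = max(range(n_haps), key=lambda h: prev[h])
        row, ptr = [], []
        for h in range(n_haps):
            stay = prev[h]
            switch = prev[top] - switch_penalty
            if stay >= switch:
                row.append(hits[b][h] + stay)
                ptr.append(h)
            else:
                row.append(hits[b][h] + switch)
                ptr.append(top)
        best.append(row)
        back.append(ptr)

    # Trace back from the best final haplotype.
    h = max(range(n_haps), key=lambda x: best[-1][x])
    choice = [h]
    for b in range(n_blocks - 1, 0, -1):
        h = back[b][h]
        choice.append(h)
    choice.reverse()
    sequence = "".join(haplotypes[h][b] for b, h in enumerate(choice))
    return sequence, choice


# Two made-up haplotypes with two blocks each; the reads support a recombinant.
panel = [["ACGTAC", "TTGACA"], ["ACCTAC", "TTCACA"]]
seq, choice = infer_path(panel, reads=["ACGTAC", "TTCACA"])
print(seq, choice)
```

Raising `switch_penalty` makes the sketch prefer a single haplotype even when the reads support a recombinant, which mirrors the role of the recombination term in the abstract's objective.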


