Review
Nature. 2022 Apr;604(7906):437-446.
doi: 10.1038/s41586-022-04601-8. Epub 2022 Apr 20.

The Human Pangenome Project: a global resource to map genomic diversity

Ting Wang et al. Nature. 2022 Apr.

Abstract

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.

Conflict of interest statement

None declared.

Figures

Figure 1. The Human Pangenome Reference Consortium.
This diagram provides an overview of the HPRC’s several components. Collect: 1,000 Genomes samples jump-start the project and will be followed by additional samples collected through community engagement and recruitment. Sample selection efforts will ensure the graph-based reference captures global human genomic diversity. Sequence: Long-read and long-range technologies are used to generate genome graphs and bridge gaps in difficult-to-assemble genomic regions. Assemble: Telomere-to-telomere finished diploid genomes will foster variant discovery, especially in complex, difficult-to-assemble genomic regions. Construct: Scalable bioinformatics approaches assemble, QC, call variants, and benchmark graph assembly accuracy. The graph is annotated with gene descriptions and transcriptome data, making it more accessible and interpretable. Utilize: Collaboration across scientific and stakeholder communities will create a new ecosystem of analysis tools. Clinical applications and research use will involve analysis, validation, interpretation, and publication of results. Outreach: Members of the HPRC Outreach community engage and educate the user community and broadly share all genomic products and informatics platforms. ELSI: ELSI scholars will develop selection processes and policy frameworks that meet investigator needs while respecting research partner autonomy and cultural norms.
Figure 2. Standards were developed through a pilot benchmark study of one individual.
Multiple long-read and long-range technologies and computational methods were evaluated to develop the combination of platforms and an automated pipeline that provides the most complete and accurate genome graph.
Figure 3. The Human Pangenome Reference.
Graph-aware mappers can be used to genotype samples by directly mapping against the graph. This simplified example shows how to create a pangenome graph for four people and calculate the allele frequency of three variants. Iterating through each individual produces the graph structure, which improves as new genomes are added. Genomic data is arranged into a sequence variation map based on edges. Alternative haplotypes are depicted as alternate pathways across the graph, with the edges being the primary data-bearing elements. The pangenome reference catalogs genomic variation and allows for population-scale analysis thanks to its graph structure. Tracing a path through the network and connecting sequences at access edges yields haplotypes for individuals. For clinical interpretation, allele frequencies are reported.
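
The idea in the Figure 3 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the consortium's pipeline: the segment names, sequences, and four haplotype paths are invented toy data, each person is represented by a single haplotype rather than a diploid pair, and sequence is stored on graph nodes (as in GFA-style graphs) rather than on edges as the figure depicts.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): short sequence segments.
SEGMENTS = {
    "s1": "ACGT",
    "s2": "A",    # reference allele of variant 1
    "s3": "G",    # alternate allele of variant 1 (SNV)
    "s4": "TTC",
    "s5": "CAT",  # variant 2: an insertion that some paths skip
    "s6": "GG",
    "s7": "C",    # reference allele of variant 3
    "s8": "T",    # alternate allele of variant 3 (SNV)
    "s9": "AA",
}

# One haplotype path per person (a simplification; real samples are diploid).
HAPLOTYPES = {
    "person1": ["s1", "s2", "s4", "s6", "s7", "s9"],
    "person2": ["s1", "s3", "s4", "s6", "s7", "s9"],
    "person3": ["s1", "s2", "s4", "s5", "s6", "s8", "s9"],
    "person4": ["s1", "s3", "s4", "s5", "s6", "s7", "s9"],
}


def build_graph(haplotypes):
    """Iterate through each haplotype and record every edge it walks.

    Returns a dict mapping (from_segment, to_segment) to the number of
    haplotype paths that traverse that edge.
    """
    edges = defaultdict(int)
    for path in haplotypes.values():
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return dict(edges)


def allele_frequency(haplotypes, segment):
    """Fraction of haplotype paths that pass through the given segment."""
    carriers = sum(segment in path for path in haplotypes.values())
    return carriers / len(haplotypes)


def haplotype_sequence(path, segments):
    """Reconstruct a haplotype by concatenating the segments along its path."""
    return "".join(segments[name] for name in path)


if __name__ == "__main__":
    graph = build_graph(HAPLOTYPES)
    print(len(graph), "edges")
    for allele in ("s3", "s5", "s8"):
        print(allele, allele_frequency(HAPLOTYPES, allele))
    print(haplotype_sequence(HAPLOTYPES["person3"], SEGMENTS))
```

Adding a fifth path to `HAPLOTYPES` and rebuilding extends the graph with any new edges, which mirrors the caption's point that the structure improves as genomes are added.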

Similar articles

  • Semi-automated assembly of high-quality diploid human reference genomes.
    Jarvis ED, Formenti G, Rhie A, Guarracino A, Yang C, Wood J, Tracey A, Thibaud-Nissen F, Vollger MR, Porubsky D, Cheng H, Asri M, Logsdon GA, Carnevali P, Chaisson MJP, Chin CS, Cody S, Collins J, Ebert P, Escalona M, Fedrigo O, Fulton RS, Fulton LL, Garg S, Gerton JL, Ghurye J, Granat A, Green RE, Harvey W, Hasenfeld P, Hastie A, Haukness M, Jaeger EB, Jain M, Kirsche M, Kolmogorov M, Korbel JO, Koren S, Korlach J, Lee J, Li D, Lindsay T, Lucas J, Luo F, Marschall T, Mitchell MW, McDaniel J, Nie F, Olsen HE, Olson ND, Pesout T, Potapova T, Puiu D, Regier A, Ruan J, Salzberg SL, Sanders AD, Schatz MC, Schmitt A, Schneider VA, Selvaraj S, Shafin K, Shumate A, Stitziel NO, Stober C, Torrance J, Wagner J, Wang J, Wenger A, Xiao C, Zimin AV, Zhang G, Wang T, Li H, Garrison E, Haussler D, Hall I, Zook JM, Eichler EE, Phillippy AM, Paten B, Howe K, Miga KH; Human Pangenome Reference Consortium. Jarvis ED, et al. Nature. 2022 Nov;611(7936):519-531. doi: 10.1038/s41586-022-05325-5. Epub 2022 Oct 19. Nature. 2022. PMID: 36261518 Free PMC article.
  • Perspectives and opportunities in forensic human, animal, and plant integrative genomics in the Pangenome era.
    He G, Liu C, Wang M. He G, et al. Forensic Sci Int. 2025 Feb;367:112370. doi: 10.1016/j.forsciint.2025.112370. Epub 2025 Jan 12. Forensic Sci Int. 2025. PMID: 39813779 Review.
  • Beyond the Human Genome Project: The Age of Complete Human Genome Sequences and Pangenome References.
    Taylor DJ, Eizenga JM, Li Q, Das A, Jenike KM, Kenny EE, Miga KH, Monlong J, McCoy RC, Paten B, Schatz MC. Taylor DJ, et al. Annu Rev Genomics Hum Genet. 2024 Aug;25(1):77-104. doi: 10.1146/annurev-genom-021623-081639. Epub 2024 Aug 6. Annu Rev Genomics Hum Genet. 2024. PMID: 38663087 Free PMC article. Review.
  • A draft human pangenome reference.
    Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ, Buonaiuto S, Chang XH, Cheng H, Chu J, Colonna V, Eizenga JM, Feng X, Fischer C, Fulton RS, Garg S, Groza C, Guarracino A, Harvey WT, Heumos S, Howe K, Jain M, Lu TY, Markello C, Martin FJ, Mitchell MW, Munson KM, Mwaniki MN, Novak AM, Olsen HE, Pesout T, Porubsky D, Prins P, Sibbesen JA, Sirén J, Tomlinson C, Villani F, Vollger MR, Antonacci-Fulton LL, Baid G, Baker CA, Belyaeva A, Billis K, Carroll A, Chang PC, Cody S, Cook DE, Cook-Deegan RM, Cornejo OE, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld AL, Formenti G, Frankish A, Gao Y, Garrison NA, Giron CG, Green RE, Haggerty L, Hoekzema K, Hourlier T, Ji HP, Kenny EE, Koenig BA, Kolesnikov A, Korbel JO, Kordosky J, Koren S, Lee H, Lewis AP, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson ND, Popejoy AB, Puiu D, Rautiainen M, Regier AA, Rhie A, Sacco S, Sanders AD, Schneider VA, Schultz BI, Shafin K, Smith MW, Sofia HJ, Abou Tayoun AN, Thibaud-Nissen F, Tricomi FF, Wagner J, Walenz B, Wood JMD, Zimin AV, Bourque G, Chaisson MJP, Flicek P, Phillippy AM, Zook JM, Eichler EE, … Liao WW, et al. Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10. Nature. 2023. PMID: 37165242 Free PMC article.
  • Telomere-to-telomere assembly of diploid chromosomes with Verkko.
    Rautiainen M, Nurk S, Walenz BP, Logsdon GA, Porubsky D, Rhie A, Eichler EE, Phillippy AM, Koren S. Rautiainen M, et al. Nat Biotechnol. 2023 Oct;41(10):1474-1482. doi: 10.1038/s41587-023-01662-6. Epub 2023 Feb 16. Nat Biotechnol. 2023. PMID: 36797493 Free PMC article.

Key References

    61. Cheng H, Concepcion GT, Feng X, Zhang H & Li H Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175, doi:10.1038/s41592-020-01056-5 (2021).

      o Hifiasm is a haplotype-resolved assembler designed specifically for PacBio HiFi reads; it aims to represent haplotype information in a phased assembly graph.

    22. Nurk S et al. The complete sequence of a human genome. bioRxiv, 2021.05.26.445798, doi:10.1101/2021.05.26.445798 (2021).

      o The first complete genome assembly from the Telomere-to-Telomere (T2T) Consortium, which closed all remaining gaps in GRCh38, including all acrocentric short arms, segmental duplications, and human centromeric regions.

    20. Miga KH et al. Telomere-to-telomere assembly of a complete human X chromosome. Nature 585, 79–84, doi:10.1038/s41586-020-2547-7 (2020).

      o The sequence of the first complete human chromosome.

    64. Li H, Feng X & Chu C The design and construction of reference pangenome graphs with minigraph. Genome Biol 21, 265, doi:10.1186/s13059-020-02168-z (2020).

      o The minigraph toolkit efficiently constructs a pangenome graph and is useful for mapping and for building graphs that encode structural variation.

    70. Paten B et al. Cactus: Algorithms for genome multiple sequence alignment. Genome Res 21, 1512–1528, doi:10.1101/gr.123356.111 (2011).

      o Cactus is a highly accurate, reference-free multiple genome alignment program useful for studying general rearrangement and copy number variation.
