The Platinum Pedigree: a long-read benchmark for genetic variants
- PMID: 40759746
- PMCID: PMC12533576
- DOI: 10.1038/s41592-025-02750-y
Abstract
Recent advances in genome sequencing have improved variant calling in complex regions of the human genome. However, it is difficult to quantify variant calling performance because existing standards often focus on specificity, neglecting completeness in difficult-to-analyze regions. To create a more comprehensive truth set, we used Mendelian inheritance in a large pedigree (CEPH-1463) to filter variants across PacBio high-fidelity (HiFi), Illumina and Oxford Nanopore Technologies platforms. This generated a variant map with over 4.7 million single-nucleotide variants, 767,795 insertions and deletions (indels), 537,486 tandem repeats and 24,315 structural variants, covering 2.77 Gb of the GRCh38 genome. This work adds ~200 Mb of high-confidence regions, including 8% more small variants, and introduces the first tandem repeat and structural variant truth sets for NA12878 and her family. As an example of the value of this improved benchmark, we retrained DeepVariant using these data to reduce genotyping errors by ~34%.
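As a rough illustration of what filtering by Mendelian inheritance means, the sketch below checks whether a child's genotype at a site can be formed from one allele of each parent, and keeps only sites that pass. It is a simplified single-trio check for diploid autosomal sites, not the authors' pipeline, which works across a multi-generation pedigree and several sequencing platforms. The function names, the genotype encoding (tuples of allele indices) and the `calls` layout are assumptions made for this example.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child's unordered diploid genotype can be built
    from one maternal allele and one paternal allele."""
    target = sorted(child)
    return any(sorted((m, f)) == target for m, f in product(mother, father))


def filter_sites(calls):
    """Keep the site keys whose trio genotypes are Mendelian consistent.

    `calls` maps a site label to a dict with "child", "mother" and "father"
    genotypes (hypothetical layout used only for this sketch).
    """
    return [
        site
        for site, gts in calls.items()
        if is_mendelian_consistent(gts["child"], gts["mother"], gts["father"])
    ]


# Example input: allele 0 is the reference, allele 1 is the alternate.
calls = {
    "chr1:100": {"child": (0, 1), "mother": (0, 0), "father": (0, 1)},
    "chr1:200": {"child": (1, 1), "mother": (0, 0), "father": (0, 1)},
}
kept = filter_sites(calls)
```

In this example the second site is dropped because the mother carries no alternate allele, so a homozygous alternate child is inconsistent with the parental genotypes (barring a de novo mutation or a genotyping error).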
© 2025. The Author(s), under exclusive licence to Springer Nature America, Inc.
Conflict of interest statement
Competing interests: Z.K., C.N., T.M., W.J.R., S.L., E.D., J.M.H., C.T.S., K.P.C., C.F., C.L., X.C. and M.A.E. are employees and shareholders of PacBio. Z.K. holds private equity in Phase Genomics. P.-C.C. and A.C. are employees and shareholders of Google LLC. E.E.E. is a scientific advisory board (SAB) member of Variant Bio, Inc. All other authors have no competing interests.
Grants and funding
- R00HG011657/U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- R35GM118335/U.S. Department of Health & Human Services | NIH | Center for Information Technology (Center for Information Technology, National Institutes of Health)
- R01 HG010169/HG/NHGRI NIH HHS/United States
- R01 HG002385/HG/NHGRI NIH HHS/United States
- Intramural Funding/United States Department of Commerce | National Institute of Standards and Technology (NIST)
