Using large-scale population-based data to improve disease risk assessment of clinical variants
- PMID: 40551016
- PMCID: PMC12321300
- DOI: 10.1038/s41588-025-02212-3
Abstract
Understanding the disease risk of genetic variants is fundamental to precision medicine. Estimates of penetrance, the probability of disease for individuals with a variant allele, rely on disease-specific cohorts, clinical testing and emerging electronic health record (EHR)-linked biobanks. These data sources, while valuable, each have limitations in quality, representativeness and analyzability. Here, we provide a historical account of the currently accepted pathogenicity classification system and data available in ClinVar, a public archive that aggregates variant interpretations but lacks detailed data for accurate penetrance assessment, highlighting its oversimplification of disease risk. We propose an integrative Bayesian framework that unifies pathogenicity and penetrance, leveraging both functional and real-world evidence to refine risk predictions. In addition, we advocate for enhancing ClinVar with the inclusion of high-priority phenotypes, age-stratified data and population-based cohorts linked to EHRs. We suggest developing a community repository of population-based penetrance estimates to support the clinical application of genetic data.
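As an illustration only, the idea of combining prior (for example, functional) evidence with observed carrier counts can be sketched with a textbook Beta-Binomial update. This is not the framework proposed in the article, whose details are not given in the abstract. The prior parameters, the carrier counts, and the names `BetaDist`, `update_penetrance` and `mean` below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaDist:
    """Beta(alpha, beta) distribution over penetrance (a probability in [0, 1])."""
    alpha: float
    beta: float


def update_penetrance(prior: BetaDist, affected: int, carriers: int) -> BetaDist:
    """Conjugate Beta-Binomial update: add affected and unaffected carrier counts."""
    if not 0 <= affected <= carriers:
        raise ValueError("affected must be between 0 and carriers")
    return BetaDist(prior.alpha + affected, prior.beta + (carriers - affected))


def mean(dist: BetaDist) -> float:
    """Posterior mean penetrance."""
    return dist.alpha / (dist.alpha + dist.beta)


# Hypothetical data: 3 affected individuals among 10 carriers in a cohort.
flat_prior = BetaDist(1.0, 1.0)        # assumption: no prior information
functional_prior = BetaDist(8.0, 2.0)  # assumption: functional evidence suggests high risk

print(round(mean(update_penetrance(flat_prior, 3, 10)), 3))        # 0.333
print(round(mean(update_penetrance(functional_prior, 3, 10)), 3))  # 0.55
```

With the same cohort data, the two priors give different posterior means, which shows how prior evidence can shift a penetrance estimate when carrier counts are small.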
© 2025. Springer Nature America, Inc.
Conflict of interest statement
Competing interests: R.D. reported being a scientific cofounder, consultant and equity holder for Pensieve Health (pending) and a consultant for Variant Bio and Character Bio. J.M.E. reported being a cofounder, board member and executive of the nonprofit Center for Genomic Interpretation, with part of its mission overlapping with the interests of this work, specifically the mission to encourage careful stewardship of clinical genetics. J.M.E. is also the founder of and a consultant for Grandview Consulting LLC, not related to this work. K.-L.H. is a founder of Open Box Science, not related to this work. W.K.C. is on the Board of Directors of Prime Medicine and Rallybio, not related to this work. The other authors declare no competing interests.
