Critical assessment of missense variant effect predictors on disease-relevant variant data

Ruchir Rastogi et al. bioRxiv.

Update in

  • Critical assessment of missense variant effect predictors on disease-relevant variant data.
    Rastogi R, Chung R, Li S, Li C, Lee K, Woo J, Kim DW, Keum C, Babbi G, Martelli PL, Savojardo C, Casadio R, Chennen K, Weber T, Poch O, Ancien F, Cia G, Pucci F, Raimondi D, Vranken W, Rooman M, Marquet C, Olenyi T, Rost B, Andreoletti G, Kamandula A, Peng Y, Bakolitsa C, Mort M, Cooper DN, Bergquist T, Pejaver V, Liu X, Radivojac P, Brenner SE, Ioannidis NM. Hum Genet. 2025 Mar;144(2-3):281-293. doi: 10.1007/s00439-025-02732-2. Epub 2025 Mar 21. PMID: 40113603. Free PMC article.

Abstract

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. 
Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools, and the CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
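
To make the idea of high-specificity and high-sensitivity regimes concrete, here is a minimal Python sketch of one common way to score a predictor in each regime: the best sensitivity reachable while keeping specificity above a floor, and the reverse. The abstract does not state which metrics the assessment used, so this is an illustration under stated assumptions, not the authors' method. The function names are hypothetical, the toy scores and labels are invented for demonstration and are not results from the study, and higher scores are assumed to mean "more likely pathogenic".

```python
from typing import List, Sequence, Tuple


def _operating_points(scores: Sequence[float],
                      labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (sensitivity, specificity) at every candidate threshold.

    Assumptions: labels are 1 (pathogenic) or 0 (benign), and a variant is
    called pathogenic when its score is >= the threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    # Every observed score is a candidate threshold; +inf calls nothing pathogenic.
    thresholds = sorted(set(scores)) + [float("inf")]
    points = []
    for t in thresholds:
        sensitivity = sum(s >= t for s in pos) / len(pos)
        specificity = sum(s < t for s in neg) / len(neg)
        points.append((sensitivity, specificity))
    return points


def sensitivity_at_specificity(scores: Sequence[float], labels: Sequence[int],
                               min_specificity: float) -> float:
    """High-specificity regime: best sensitivity with specificity >= floor."""
    return max(sens for sens, spec in _operating_points(scores, labels)
               if spec >= min_specificity)


def specificity_at_sensitivity(scores: Sequence[float], labels: Sequence[int],
                               min_sensitivity: float) -> float:
    """High-sensitivity regime: best specificity with sensitivity >= floor."""
    return max(spec for sens, spec in _operating_points(scores, labels)
               if sens >= min_sensitivity)


# Toy data (invented for illustration only).
toy_scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

high_spec = sensitivity_at_specificity(toy_scores, toy_labels, 1.0)
high_sens = specificity_at_sensitivity(toy_scores, toy_labels, 1.0)
print(high_spec, high_sens)
```

Two predictors can rank differently under these two summaries, because each depends on a different end of the score distribution. That is the sense in which relative performance can differ between the regimes.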
