Critical assessment of missense variant effect predictors on disease-relevant variant data

Ruchir Rastogi¹, Ryan Chung², Sindy Li³, Chang Li⁴, Kyoungyeul Lee⁵, Junwoo Woo⁵, Dong-Wook Kim⁵, Changwon Keum⁵, Giulia Babbi⁶, Pier Luigi Martelli⁶, Castrense Savojardo⁶, Rita Casadio⁶, Kirsley Chennen⁷, Thomas Weber⁷, Olivier Poch⁷, François Ancien⁸,⁹, Gabriel Cia⁸,⁹, Fabrizio Pucci⁸,⁹, Daniele Raimondi¹⁰,¹¹, Wim Vranken⁹,¹², Marianne Rooman⁸,⁹, Céline Marquet¹³, Tobias Olenyi¹³, Burkhard Rost¹³, Gaia Andreoletti³,¹⁴, Akash Kamandula¹⁵, Yisu Peng¹⁵, Constantina Bakolitsa³, Matthew Mort¹⁶, David N Cooper¹⁶, Timothy Bergquist¹⁷, Vikas Pejaver¹⁷,¹⁸, Xiaoming Liu⁴, Predrag Radivojac¹⁵, Steven E Brenner¹⁹,²⁰, Nilah M Ioannidis²¹,²²,²³

Affiliations

¹ Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA. ruchir_rastogi@berkeley.edu.
² Center for Computational Biology, University of California, Berkeley, CA, USA.
³ Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA.
⁴ USF Genomics, College of Public Health, University of South Florida, Tampa, FL, USA.
⁵ 3billion Inc., Seoul, South Korea.
⁶ Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
⁷ University of Strasbourg, Strasbourg, France.
⁸ Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium.
⁹ Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium.
¹⁰ ESAT-STADIUS, KU Leuven, Leuven, Belgium.
¹¹ Institut de Génétique Moléculaire de Montpellier, Université de Montpellier, Montpellier, France.
¹² Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium.
¹³ Department of Informatics, Bioinformatics and Computational Biology, Technical University of Munich, Munich, Germany.
¹⁴ Sage Bionetworks, Seattle, WA, USA.
¹⁵ Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
¹⁶ Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, UK.
¹⁷ Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
¹⁸ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
¹⁹ Center for Computational Biology, University of California, Berkeley, CA, USA. brenner@compbio.berkeley.edu.
²⁰ Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA. brenner@compbio.berkeley.edu.
²¹ Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA. nilah@berkeley.edu.
²² Center for Computational Biology, University of California, Berkeley, CA, USA. nilah@berkeley.edu.
²³ Chan Zuckerberg Biohub, San Francisco, CA, USA. nilah@berkeley.edu.

PMID: 40113603
PMCID: PMC11976771
DOI: 10.1007/s00439-025-02732-2

Ruchir Rastogi et al. Hum Genet. 2025 Mar;144(2-3):281-293. doi: 10.1007/s00439-025-02732-2. Epub 2025 Mar 21.

Abstract

Regular, systematic, and independent assessments of computational tools that are used to predict the pathogenicity of missense variants are necessary to evaluate their clinical and research utility and guide future improvements. The Critical Assessment of Genome Interpretation (CAGI) conducts the ongoing Annotate-All-Missense (Missense Marathon) challenge, in which missense variant effect predictors (also called variant impact predictors) are evaluated on missense variants added to disease-relevant databases following the prediction submission deadline. Here we assess predictors submitted to the CAGI 6 Annotate-All-Missense challenge, predictors commonly used in clinical genetics, and recently developed deep learning methods. We examine performance across a range of settings relevant for clinical and research applications, focusing on different subsets of the evaluation data as well as high-specificity and high-sensitivity regimes. Our evaluations reveal notable advances in current methods relative to older, well-cited tools in the field. While meta-predictors tend to outperform their constituent individual predictors, several newer individual predictors perform comparably to commonly used meta-predictors. Predictor performance varies between high-specificity and high-sensitivity regimes, highlighting that different methods may be optimal for different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors trained on pathogenicity labels from curated variant databases often inherit gene-level label imbalances. Our findings help illuminate the clinical and research utility of modern missense variant effect predictors and identify potential areas for future development.

Conflict of interest statement

Declarations. Conflict of interest: The authors declare no conflicts of interest.

Figures

**Fig. 1**
**Full ROC curve performance.** We show the ROC curves and AUROCs for meta-predictors (left) and individual predictors (right) on the full evaluation dataset. Predictors marked by diamonds use allele frequency as a feature. The black dashed lines at 5% FPR and 95% TPR demarcate the boundaries of the high-specificity and high-sensitivity regions, respectively, which are enlarged in Fig. 2

**Fig. 2**
**Performance in high-specificity and high-sensitivity regimes.** We show enlarged portions of the ROC curves from Fig. 1 to focus on (A) the high-specificity region ($\mathrm{FPR} \leq 5\%$) and (B) the high-sensitivity region ($\mathrm{TPR} \geq 95\%$) for meta-predictors (left) and individual predictors (right). We also show the normalized area under the curve in these regions (normalized such that a perfect classifier gets a score of 1 and a random classifier gets a score of 0.5). Predictors marked by diamonds use allele frequency as a feature
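
The normalization described in this caption can be illustrated with a minimal sketch. It assumes a McClish-style standardization of the partial area, which maps a perfect classifier to 1 and a random one to 0.5; the exact procedure used in the paper is not given here. The function names (`roc_points`, `normalized_pauc`, `normalized_pauc_high_sensitivity`) are hypothetical, and labels are assumed to be 1 for pathogenic and 0 for benign, with higher scores meaning more pathogenic.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) arrays with one point per distinct score threshold."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")
    labels, scores = labels[order], scores[order]
    # Keep only the last index of each run of tied scores.
    last = np.r_[np.nonzero(np.diff(scores))[0], labels.size - 1]
    tp = np.cumsum(labels)[last]
    fp = np.cumsum(~labels)[last]
    tpr = np.r_[0.0, tp / labels.sum()]
    fpr = np.r_[0.0, fp / (~labels).sum()]
    return fpr, tpr

def normalized_pauc(labels, scores, max_fpr=0.05):
    """Partial AUC over FPR in [0, max_fpr], rescaled (assumed McClish-style)."""
    fpr, tpr = roc_points(labels, scores)
    # Cut the curve at max_fpr, interpolating linearly between neighbors.
    stop = np.searchsorted(fpr, max_fpr, side="right")
    tpr_cut = np.interp(max_fpr, fpr[stop - 1:stop + 1], tpr[stop - 1:stop + 1])
    x = np.r_[fpr[:stop], max_fpr]
    y = np.r_[tpr[:stop], tpr_cut]
    pauc = np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0)  # trapezoidal rule
    min_area = 0.5 * max_fpr ** 2   # area under the diagonal (random classifier)
    max_area = max_fpr              # area for a perfect classifier
    return 0.5 * (1.0 + (pauc - min_area) / (max_area - min_area))

def normalized_pauc_high_sensitivity(labels, scores, min_tpr=0.95):
    """Same score for the TPR >= min_tpr band, via label and score flipping."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # After flipping, the new FPR equals 1 - TPR of the original classifier.
    return normalized_pauc(~labels, -scores, max_fpr=1.0 - min_tpr)
```

Flipping labels and negating scores turns the band TPR ≥ 95% into the band FPR ≤ 5% of the mirrored problem, so one routine can serve both regimes under this assumption.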

**Fig. 3**
**Allele frequency bias.** Top-performing predictors are evaluated for distinguishing benign variants in different allele frequency bins from pathogenic variants. All 6103 pathogenic variants were used in each evaluation, and benign variants were stratified by their allele frequencies obtained from the control cohort exomes in gnomAD v2.1.1 (Karczewski et al. 2020). Predictors marked by diamonds use allele frequency as a feature
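
The stratified evaluation in this caption can be sketched as follows: every pathogenic variant is kept in each comparison, and only the benign set changes with the allele frequency bin. This is an illustrative sketch, not the authors' code. The function names and the bin edges in the test data are hypothetical, and the pairwise AUROC below is chosen for brevity rather than speed.

```python
import numpy as np

def auroc(pos_scores, neg_scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean(pos > neg) + 0.5 * np.mean(pos == neg))

def auroc_by_af_bin(path_scores, benign_scores, benign_af, bin_edges):
    """Compare all pathogenic variants against benign variants in each AF bin."""
    benign_scores = np.asarray(benign_scores, dtype=float)
    benign_af = np.asarray(benign_af, dtype=float)
    results = {}
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (benign_af >= lo) & (benign_af < hi)  # half-open bin [lo, hi)
        if in_bin.any():  # skip bins that contain no benign variants
            results[(lo, hi)] = auroc(path_scores, benign_scores[in_bin])
    return results
```

A predictor that relies on allele frequency would be expected to show a lower value in the rarest bin than in the common bins, which is the pattern the figure is designed to expose.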

**Fig. 4**
**Gene label balancing.** We constructed a gene label-balanced subset of our evaluation dataset containing an equal number of pathogenic and benign variants per gene. This label-balanced dataset consists of 2140 variants from 504 genes. Performance on the label-balanced dataset (y-axis) is compared to performance on the full dataset from Fig. 1 (x-axis) for meta-predictors (left) and individual predictors (right). Predictors marked by diamonds use allele frequency as a feature
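
One way to build such a label-balanced subset is sketched below. The caption does not say how variants were chosen within a gene, so the random downsampling of the larger class, the fixed seed, and the function name `gene_label_balanced_subset` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def gene_label_balanced_subset(variants, seed=0):
    """Keep an equal number of pathogenic and benign variants per gene.

    variants: iterable of (variant_id, gene, is_pathogenic) tuples.
    """
    rng = random.Random(seed)
    by_gene = defaultdict(lambda: {True: [], False: []})
    for variant_id, gene, is_pathogenic in variants:
        by_gene[gene][bool(is_pathogenic)].append(variant_id)

    subset = []
    for gene in sorted(by_gene):
        pathogenic = by_gene[gene][True]
        benign = by_gene[gene][False]
        k = min(len(pathogenic), len(benign))
        # Genes with only one label have k == 0 and contribute nothing.
        subset += [(v, gene, True) for v in rng.sample(pathogenic, k)]
        subset += [(v, gene, False) for v in rng.sample(benign, k)]
    return subset
```

On a subset built this way, a predictor cannot gain performance simply by learning which genes are mostly pathogenic or mostly benign, since every retained gene has a 50/50 label split.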

Update of

Critical assessment of missense variant effect predictors on disease-relevant variant data.
Rastogi R, Chung R, Li S, Li C, Lee K, Woo J, Kim DW, Keum C, Babbi G, Martelli PL, Savojardo C, Casadio R, Chennen K, Weber T, Poch O, Ancien F, Cia G, Pucci F, Raimondi D, Vranken W, Rooman M, Marquet C, Olenyi T, Rost B, Andreoletti G, Kamandula A, Peng Y, Bakolitsa C, Mort M, Cooper DN, Bergquist T, Pejaver V, Liu X, Radivojac P, Brenner SE, Ioannidis NM. bioRxiv [Preprint]. 2024 Jun 8:2024.06.06.597828. doi: 10.1101/2024.06.06.597828. Update in: Hum Genet. 2025 Mar;144(2-3):281-293. doi: 10.1007/s00439-025-02732-2. PMID: 38895200.
