The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity
- PMID: 25684150
- PMCID: PMC4409520
- DOI: 10.1002/humu.22768
Abstract
Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen-2, SIFT, FatHMM, MutationTaster-2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Given the wealth of these methods, an important practical question is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. Here we demonstrate, in a study of 10 tools on five datasets, that such a comparative evaluation is hindered by two types of circularity. They arise when (1) the same variants or (2) different variants from the same protein occur both in the datasets used for training and in those used for evaluating these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity-confounded tools are the most accurate among all tools, and that they may even outperform optimized combinations of tools.
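The two types of circularity described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the `Variant` record, the `split_by_circularity` helper, and the protein and substitution identifiers are all hypothetical, and the toy data are invented purely for illustration. The sketch partitions an evaluation set into variants that also appear in the training set (type 1), variants that are new but come from a protein seen in training (type 2), and variants from proteins absent from training.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical record for a labeled missense variant."""
    protein: str      # protein identifier (made-up IDs in the toy data)
    change: str       # amino-acid substitution, e.g. "A10V"
    pathogenic: bool  # label used for training or evaluation


def split_by_circularity(
    train: List[Variant], evaluation: List[Variant]
) -> Tuple[List[Variant], List[Variant], List[Variant]]:
    """Partition evaluation variants by their overlap with the training set."""
    train_keys = {(v.protein, v.change) for v in train}
    train_proteins = {v.protein for v in train}

    type1, type2, clean = [], [], []
    for v in evaluation:
        if (v.protein, v.change) in train_keys:
            type1.append(v)   # same variant seen in training
        elif v.protein in train_proteins:
            type2.append(v)   # new variant, but protein seen in training
        else:
            clean.append(v)   # protein never seen in training
    return type1, type2, clean


# Toy data (invented for illustration only).
train = [Variant("P1", "A10V", True), Variant("P2", "G5S", False)]
evaluation = [
    Variant("P1", "A10V", True),   # overlaps training exactly
    Variant("P1", "L20P", True),   # same protein, different variant
    Variant("P3", "K7E", False),   # unseen protein
]

type1, type2, clean = split_by_circularity(train, evaluation)
print(len(type1), len(type2), len(clean))
```

Under these assumptions, scoring a tool only on the `clean` subset gives an estimate that is not inflated by either kind of overlap, at the cost of a smaller evaluation set.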
Keywords: exome sequencing; pathogenicity prediction tools.
© 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.
Similar articles
- REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genet. 2016 Oct 6;99(4):877-885. doi: 10.1016/j.ajhg.2016.08.016. Epub 2016 Sep 22. PMID: 27666373.
- Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Hum Mol Genet. 2015 Apr 15;24(8):2125-37. doi: 10.1093/hmg/ddu733. Epub 2014 Dec 30. PMID: 25552646.
- The CYSMA web server: An example of integrative tool for in silico analysis of missense variants identified in Mendelian disorders. Hum Mutat. 2020 Feb;41(2):375-386. doi: 10.1002/humu.23941. Epub 2019 Nov 15. PMID: 31674704.
- Congruency in the prediction of pathogenic missense mutations: state-of-the-art web-based tools. Brief Bioinform. 2013 Jul;14(4):448-59. doi: 10.1093/bib/bbt013. Epub 2013 Mar 15. PMID: 23505257. Review.
- A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction. Brief Bioinform. 2020 Jul 15;21(4):1119-1135. doi: 10.1093/bib/bbz051. PMID: 31204427. Review.
Cited by
- Simulation Tests of Methods in Evolution, Ecology, and Systematics: Pitfalls, Progress, and Principles. Annu Rev Ecol Evol Syst. 2022 Nov;53(1):113-136. doi: 10.1146/annurev-ecolsys-102320-093722. Epub 2022 Jul 29. PMID: 38107485.
- Analysis of missense variants in the human genome reveals widespread gene-specific clustering and improves prediction of pathogenicity. Am J Hum Genet. 2022 Mar 3;109(3):457-470. doi: 10.1016/j.ajhg.2022.01.006. Epub 2022 Feb 3. PMID: 35120630.
- ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants. Am J Hum Genet. 2018 Oct 4;103(4):474-483. doi: 10.1016/j.ajhg.2018.08.005. Epub 2018 Sep 13. PMID: 30220433.
- SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants. J Pers Med. 2022 Feb 11;12(2):263. doi: 10.3390/jpm12020263. PMID: 35207751.
- LYRUS: a machine learning model for predicting the pathogenicity of missense variants. Bioinform Adv. 2021 Dec 25;2(1):vbab045. doi: 10.1093/bioadv/vbab045. eCollection 2022. PMID: 35036922.