Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies
- PMID: 25552646
- PMCID: PMC4375422
- DOI: 10.1093/hmg/ddu733
Abstract
Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed a low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthogonal approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
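The idea in the abstract of integrating several component scores with allele frequency, and then comparing discriminative power, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the eight toy variants, the two hypothetical component scores (`score_a`, `score_b`), the allele frequencies, and the choice of a plain logistic regression fitted by gradient descent. It is not the authors' model, features, or data, and discriminative power is measured here simply as the area under the ROC curve (AUC).

```python
import math

def auc(scores, labels):
    """AUC as the fraction of (deleterious, benign) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights and bias with full-batch gradient descent."""
    n_feat = len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(X, w, b):
    """Ensemble score: predicted probability that each variant is deleterious."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

# Hypothetical toy data: (score_a, score_b, allele frequency) per variant.
X = [
    (0.9, 0.4, 0.000), (0.4, 0.9, 0.001), (0.8, 0.7, 0.000), (0.6, 0.6, 0.002),  # deleterious
    (0.5, 0.1, 0.200), (0.1, 0.5, 0.010), (0.3, 0.3, 0.050), (0.2, 0.2, 0.300),  # benign
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
ensemble = predict(X, w, b)

auc_a = auc([row[0] for row in X], y)   # first component score alone
auc_b = auc([row[1] for row in X], y)   # second component score alone
auc_ensemble = auc(ensemble, y)         # combined score

print("AUC score_a:", auc_a)
print("AUC score_b:", auc_b)
print("AUC ensemble:", auc_ensemble)
```

In this toy set each component score misranks one deleterious/benign pair, while the combined score separates the two classes. The AUCs here are computed on the same variants used for fitting, which is optimistic; the abstract stresses evaluation on separate testing datasets on which no score was tuned, and a real comparison would need the same.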
Similar articles
- REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genet. 2016 Oct 6;99(4):877-885. doi: 10.1016/j.ajhg.2016.08.016. Epub 2016 Sep 22. PMID: 27666373. Free PMC article.
- dbNSFP v2.0: a database of human non-synonymous SNVs and their functional predictions and annotations. Hum Mutat. 2013 Sep;34(9):E2393-402. doi: 10.1002/humu.22376. Epub 2013 Jul 10. PMID: 23843252. Free PMC article.
- dbNSFP v3.0: A One-Stop Database of Functional Predictions and Annotations for Human Nonsynonymous and Splice-Site SNVs. Hum Mutat. 2016 Mar;37(3):235-41. doi: 10.1002/humu.22932. Epub 2016 Jan 5. PMID: 26555599. Free PMC article.
- dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med. 2020 Dec 2;12(1):103. doi: 10.1186/s13073-020-00803-9. PMID: 33261662. Free PMC article.
- Congruency in the prediction of pathogenic missense mutations: state-of-the-art web-based tools. Brief Bioinform. 2013 Jul;14(4):448-59. doi: 10.1093/bib/bbt013. Epub 2013 Mar 15. PMID: 23505257. Review.
Cited by
- Deep Genetic Connection Between Cancer and Developmental Disorders. Hum Mutat. 2016 Oct;37(10):1042-50. doi: 10.1002/humu.23040. Epub 2016 Aug 23. PMID: 27363847. Free PMC article.
- Biallelic Mutations in TBCD, Encoding the Tubulin Folding Cofactor D, Perturb Microtubule Dynamics and Cause Early-Onset Encephalopathy. Am J Hum Genet. 2016 Oct 6;99(4):962-973. doi: 10.1016/j.ajhg.2016.08.003. Epub 2016 Sep 22. PMID: 27666370. Free PMC article.
- Towards Increasing the Clinical Relevance of In Silico Methods to Predict Pathogenic Missense Variants. PLoS Comput Biol. 2016 May 12;12(5):e1004725. doi: 10.1371/journal.pcbi.1004725. eCollection 2016 May. PMID: 27171182. Free PMC article. Review.
- Exome Sequencing Implicates Impaired GABA Signaling and Neuronal Ion Transport in Trigeminal Neuralgia. iScience. 2020 Sep 11;23(10):101552. doi: 10.1016/j.isci.2020.101552. eCollection 2020 Oct 23. PMID: 33083721. Free PMC article.
- Cystinuria Associated with Different SLC7A9 Gene Variants in the Cat. PLoS One. 2016 Jul 12;11(7):e0159247. doi: 10.1371/journal.pone.0159247. eCollection 2016. PMID: 27404572. Free PMC article.