Nucleic Acids Res. 2018 Sep 6;46(15):7793-7804.
doi: 10.1093/nar/gky678.

Performance evaluation of pathogenicity-computation methods for missense variants

Jinchen Li et al. Nucleic Acids Res. 2018.

Abstract

With expanding applications of next-generation sequencing in medical genetics, a growing number of computational methods are being developed to predict the pathogenicity of missense variants. Selecting optimal methods can accelerate the identification of candidate genes. However, the performance of different computational methods under various conditions has not been completely evaluated. Here, we compared 12 performance measures of 23 methods based on three independent benchmark datasets: (i) clinical variants from the ClinVar database related to genetic diseases, (ii) somatic variants from the IARC TP53 and ICGC databases related to human cancers and (iii) experimentally evaluated PPARG variants. Some methods performed differently under different conditions, suggesting that they are not applicable in every setting. In addition, the specificities were lower than the sensitivities for most methods (especially for the experimentally evaluated benchmark datasets), suggesting that more rigorous cutoff values are necessary to distinguish pathogenic variants. REVEL, VEST3 and the combination of both methods (i.e. ReVe) showed the best overall performance across all the benchmark data. Finally, we evaluated the performance of these methods with de novo mutations, finding that ReVe consistently showed the best performance. We have summarized the performance of different methods under various conditions, providing tentative guidance for optimal tool selection.
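The relationship between cutoff values, sensitivity and specificity noted above can be illustrated with a minimal sketch. The function name `sensitivity_specificity`, the toy scores and the labels below are all made up for illustration and are not taken from the study; the sketch assumes a variant is called deleterious when its score is strictly greater than the cutoff.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) for one cutoff.

    labels: True for a pathogenic variant, False for a benign one.
    Assumption: a variant is called deleterious when score > cutoff.
    """
    tp = fn = tn = fp = 0
    for score, is_pathogenic in zip(scores, labels):
        called_deleterious = score > cutoff
        if is_pathogenic and called_deleterious:
            tp += 1
        elif is_pathogenic:
            fn += 1
        elif called_deleterious:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four pathogenic and four benign variants.
toy_scores = [0.95, 0.90, 0.70, 0.60, 0.80, 0.40, 0.30, 0.10]
toy_labels = [True, True, True, True, False, False, False, False]

lenient = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.5)
strict = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.86)
print("cutoff 0.50:", lenient)  # sensitivity 1.0, specificity 0.75
print("cutoff 0.86:", strict)   # sensitivity 0.5, specificity 1.0
```

On this toy data, the lenient cutoff gives a specificity below the sensitivity, and raising the cutoff trades sensitivity for specificity, which is the kind of trade-off a more rigorous cutoff is meant to address.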


Figures

Figure 1.
Overall performance of the computational methods with the three sets of benchmark data. The AUC, hser-AUC and hspr-AUC of all computational methods are shown, based on germline variants of human genetic diseases from the ClinVar database (A–C), somatic variants of human cancers from the IARC TP53 database (D–F) and experimentally validated PPARG variants (G–I). The AUC, hser-AUC and hspr-AUC values for each computational method are shown in the figures. The solid lines represent function-prediction methods, the dashed lines represent conservation methods and the dotted lines represent ensemble methods. The AUC, hser-AUC and hspr-AUC performance measures do not rely on cutoff values. An interactive version of this figure is available online at http://159.226.67.237/sun/roc/.
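To illustrate why the AUC does not rely on a cutoff value, the sketch below computes it from score rankings alone: the AUC equals the fraction of (pathogenic, benign) pairs in which the pathogenic variant receives the higher score, with ties counted as one half. The function name `auc` and the toy data are hypothetical. The regional hser-AUC and hspr-AUC measures are not reproduced here, since their exact definitions are not given in this text.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: True for a pathogenic variant, False for a benign one.
    No cutoff is involved: only the relative ordering of scores matters.
    """
    pathogenic = [s for s, y in zip(scores, labels) if y]
    benign = [s for s, y in zip(scores, labels) if not y]
    if not pathogenic or not benign:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = 0.0
    for p in pathogenic:
        for b in benign:
            if p > b:
                wins += 1.0   # pathogenic variant ranked above benign one
            elif p == b:
                wins += 0.5   # ties count as half a win
    return wins / (len(pathogenic) * len(benign))


# Hypothetical toy data: four pathogenic and four benign variants.
example_scores = [0.95, 0.90, 0.70, 0.60, 0.80, 0.40, 0.30, 0.10]
example_labels = [True, True, True, True, False, False, False, False]

print(auc(example_scores, example_labels))  # 14 of 16 pairs ordered correctly: 0.875
```

Because only the ordering matters, rescaling every score by the same positive factor leaves the value unchanged, whereas sensitivity and specificity at a fixed cutoff would shift.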
Figure 2.
Performance evaluations based on de novo mutations (DNMs). The OR, 95% confidence interval and P-values were calculated by Poisson's ratio test. The area of each ball is proportional to the number of missense variants predicted to be deleterious or benign. The orange balls represent function-prediction methods, the dark gray balls represent conservation methods, and the green balls represent ensemble methods. A missense variant with a ReVe score greater than 0.86 was regarded as deleterious. *, P < 0.05; **, P < 0.01; ***, P < 0.001.
