Sci Rep. 2018 Mar 14;8(1):4480.
doi: 10.1038/s41598-018-22531-2.

Prediction and interpretation of deleterious coding variants in terms of protein structural stability

François Ancien et al. Sci Rep.

Abstract

The classification of human genetic variants into deleterious and neutral is a challenging issue, whose complexity is rooted in the large variety of biophysical mechanisms that can be responsible for disease conditions. For non-synonymous mutations in structured proteins, one of these is the protein stability change, which can lead to loss of protein structure or function. We developed a stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing. Our predictor yields a balanced accuracy of 71% in cross validation. As expected, it has a very high positive predictive value of 89%: it predicts with high accuracy the subset of mutations that are deleterious because of stability issues, but is by construction unable to classify variants that are deleterious for other reasons. Its combination with an evolutionary-based predictor increases the balanced accuracy up to 75%, and allows more than 1/4 of the variants to be predicted with 95% positive predictive value. Our method, called SNPMuSiC, can be used with both experimental and modeled structures and compares favorably with other prediction tools on several independent test sets. It constitutes a step towards interpreting variant effects at the molecular scale. SNPMuSiC is freely available at https://soft.dezyme.com/.
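
The two metrics quoted in the abstract, balanced accuracy and positive predictive value, can be illustrated with a minimal sketch. It uses the standard textbook definitions, not code from SNPMuSiC; the function names and the confusion-matrix counts are hypothetical and were chosen only to show how a predictor can have a high positive predictive value while missing many deleterious variants.

```python
def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity and specificity (standard definition)."""
    sensitivity = tp / (tp + fn)  # fraction of deleterious variants detected
    specificity = tn / (tn + fp)  # fraction of neutral variants detected
    return 0.5 * (sensitivity + specificity)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of variants predicted deleterious that truly are deleterious."""
    return tp / (tp + fp)


# Hypothetical counts, NOT taken from the paper:
# 200 deleterious variants (80 detected, 120 missed),
# 100 neutral variants (90 correctly called, 10 false alarms).
tp, fp, tn, fn = 80, 10, 90, 120

bacc = balanced_accuracy(tp, fp, tn, fn)
ppv = positive_predictive_value(tp, fp)

print(f"balanced accuracy = {bacc:.2f}")
print(f"positive predictive value = {ppv:.2f}")
```

With these illustrative counts, few neutral variants are flagged, so the positive predictive value is high, while the many missed deleterious variants pull the balanced accuracy down. This is the same qualitative pattern the abstract describes for a predictor that only captures stability-related effects.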

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Schematic representation of: (a) the artificial neural network and (b) the probabilistic neural network used in the classification of the variants.
Figure 2
Probability density distributions of deleterious mutations (red curve) and neutral mutations (blue curve) for: (a) the solvent accessibility A (0–100%) of the mutated residues; (b) the change in volume ΔV (in Å³) for residues with A ≤ 60%; (c) the change in volume ΔV for residues with A > 60%. In our conventions, mutations of smaller into larger residues have a positive ΔV value.
Figure 3
Probability density distributions of deleterious mutations (red curve) and neutral mutations (blue curve) for the changes in folding free energy ΔΔW (in kcal/mol) computed with the following statistical potentials: (a) the distance potential ΔWsd; (b) the distance potential ΔWsds; (c) the torsion angle and solvent accessibility potential ΔWsta; in our conventions, positive ΔΔW values correspond to destabilizing mutations.
Figure 4
Probability density distributions of deleterious mutations (red curve) and neutral mutations (blue curve) for (a) the change in folding free energy ΔΔG computed by PoPMuSiC (Eq. (3)) (in kcal/mol), (b) the Provean score, and (c) the pathogenicity index I computed by the ANN model (Eq. (8)).
Figure 5
Probability density distributions of deleterious mutations (red curve) and neutral mutations (blue curve) for the pathological index J (Eq. (11)) computed by SNPMuSiC. The distribution curves in the high-confidence intervals, which lie on either side of the two vertical lines, are depicted on a white background.
