Brief Bioinform. 2022 Mar 10;23(2):bbac025.
doi: 10.1093/bib/bbac025.

Systematic evaluation of computational tools to predict the effects of mutations on protein stability in the absence of experimental structures

Qisheng Pan et al. Brief Bioinform.

Abstract

Changes in protein sequence can have dramatic effects on how proteins fold, their stability and dynamics. Over the last 20 years, pioneering methods have been developed to try to estimate the effects of missense mutations on protein stability, leveraging the growing availability of protein 3D structures. These, however, have been developed and validated using experimentally derived structures and biophysical measurements. A large proportion of protein structures remain to be experimentally elucidated and, while many studies have based their conclusions on predictions made using homology models, there has been no systematic evaluation of the reliability of these tools in the absence of experimental structural data. We have, therefore, systematically investigated the performance and robustness of ten widely used structural methods when presented with homology models built using templates at a range of sequence identity levels (from 15% to 95%) and contrasted their performance with that of sequence-based tools as a baseline. We found that performance does deteriorate on homology models built using templates with sequence identity below 40%, where sequence-based tools might become preferable. This deterioration was most marked for mutations in solvent-exposed residues and for stabilizing mutations. As structure prediction tools improve, the reliability of these predictors is expected to follow; however, we strongly suggest that these factors be taken into consideration when interpreting results from structure-based predictors of mutation effects on protein stability.

Keywords: AlphaFold2; homology modelling; mutation effects on protein stability; performance evaluation.

Figures

Figure 1
Analysis workflow for assessing the performance of mutation effect predictors using homology models in different sequence identity ranges.
Figure 2
Distribution of (A) experimental ΔΔG values, (B) relative solvent accessibility (RSA), (C) mutation types, (D) root mean square deviation (RMSD) and (E) TM-score between homology models and experimental structures in eight homology model datasets of different identity levels. The RSA cutoff used to define buried or exposed residues was 20% and is shown as a blue dashed line in (B).
Figure 3
Overall performance trends based on Pearson’s correlation coefficient (R) of ten methods predicting mutation effects on protein stability, namely DDGun (brown), DUET (red), DynaMut1 (pink), DynaMut2 (green), ENCoM (orange), FoldX (blue), I-Mutant 2.0 (light blue), MAESTRO (purple), mCSM-Stability (yellow) and SDM (cyan). The R values and their trends on homology models are represented in dots and lines, respectively. A vertical long-dashed line indicates the proposed identity cutoff for homology modelling, whereas the horizontal lines are the baseline performance of four sequence-based methods, namely SAAFEC-SEQ (dotted), MUpro (dot-dashed), I-Mutant (dashed) and DDGun (long-dashed).
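The per-identity-level correlation summarised in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout `(identity_label, experimental_ddg, predicted_ddg)`, the function names and the toy numbers are all assumptions introduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient R between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("R is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def r_by_identity_level(records):
    """Group (identity_label, experimental_ddg, predicted_ddg) records
    (a hypothetical layout) and return R for each identity label."""
    grouped = {}
    for label, experimental, predicted in records:
        exp_values, pred_values = grouped.setdefault(label, ([], []))
        exp_values.append(experimental)
        pred_values.append(predicted)
    return {label: pearson_r(e, p) for label, (e, p) in grouped.items()}


# Toy records, invented purely for illustration (not data from the study).
toy_records = [
    ("high identity", -1.0, -2.0),
    ("high identity", 0.0, 0.0),
    ("high identity", 1.0, 2.0),
    ("low identity", -1.0, 1.0),
    ("low identity", 0.0, 0.0),
    ("low identity", 1.0, -1.0),
]
r_values = r_by_identity_level(toy_records)
```

Plotting one such R value per identity level for each method would give the kind of trend line the caption describes; the horizontal baselines would come from applying `pearson_r` once to each sequence-based method's predictions.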
Figure 4
Performance trends based on Pearson’s correlation coefficient (R) of ten methods with mutations grouped based on four structure-based properties, namely (A) relative solvent accessibility (RSA), (B) residue depth, (C) secondary structure types and (D) structural class based on CATH. The performance trends of the two main types of methods, namely machine learning based (ML) and statistical/energy function based (Non-ML), are displayed separately. An RSA cutoff of 20% was used to determine buried or exposed residues. A residue depth cutoff of 2.2 Å was used to determine deep or shallow residues. Four secondary structure types, namely alpha helix, beta sheet, turn and random coil, were considered in this study. Three structural classifications, namely mainly alpha, mainly beta and mixed alpha/beta, were analysed.
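The two numeric cutoffs in this caption can be applied as in the sketch below. The caption gives the cutoff values but not which side of each cutoff is inclusive, so the comparisons here (`<` for buried, `>` for deep) are assumptions, as are the function names and the representation of RSA as a fraction between 0 and 1.

```python
RSA_CUTOFF = 0.20    # 20% relative solvent accessibility (from the caption)
DEPTH_CUTOFF = 2.2   # residue depth in angstroms (from the caption)


def burial_class(rsa):
    """Label a residue by RSA given as a fraction in [0, 1].
    Assumption: values below the cutoff count as buried."""
    return "buried" if rsa < RSA_CUTOFF else "exposed"


def depth_class(depth_angstrom):
    """Label a residue by depth in angstroms.
    Assumption: values above the cutoff count as deep."""
    return "deep" if depth_angstrom > DEPTH_CUTOFF else "shallow"


def group_by(mutations, classify, key):
    """Split mutation records (dicts, a hypothetical layout) into groups
    according to classify(record[key])."""
    groups = {}
    for record in mutations:
        groups.setdefault(classify(record[key]), []).append(record)
    return groups


# Toy records, invented for illustration only.
toy_mutations = [
    {"id": "m1", "rsa": 0.05, "depth": 4.1},
    {"id": "m2", "rsa": 0.55, "depth": 1.8},
    {"id": "m3", "rsa": 0.35, "depth": 2.0},
]
by_burial = group_by(toy_mutations, burial_class, "rsa")
by_depth = group_by(toy_mutations, depth_class, "depth")
```

Each resulting group could then be scored separately with Pearson's R to compare predictor performance on buried versus exposed, or deep versus shallow, residues.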
Figure 5
Performance trends based on Pearson’s correlation coefficient (R) of ten methods with mutations grouped based on four sequence-based properties, namely (A) mutation type based on the change of polarity, (B) change of residue volume, (C) mutations involving glycine and (D) mutation effects on protein stability.
