NAR Genom Bioinform. 2020 Jun;2(2):lqaa015.
doi: 10.1093/nargab/lqaa015. Epub 2020 Mar 5.

Mutation effect estimation on protein-protein interactions using deep contextualized representation learning

Guangyu Zhou et al. NAR Genom Bioinform. 2020 Jun.

Abstract

The functional impact of protein mutations is reflected in altered conformation and thermodynamics of protein-protein interactions (PPIs). Changes in two interacting proteins upon mutation are commonly quantified by computational approaches. Hence, extensive research effort has been devoted to extracting energetic or structural features of proteins, followed by statistical learning methods that estimate the effects of mutations on PPI properties. Nonetheless, such features require extensive human labor and expert knowledge to obtain, and have limited ability to reflect point mutations. We present an end-to-end deep learning framework, MuPIPR (Mutation Effects in Protein-protein Interaction PRediction Using Contextualized Representations), to estimate the effects of mutations on PPIs. MuPIPR incorporates a contextualized representation mechanism of amino acids to propagate the effects of a point mutation to surrounding amino acid representations, thereby amplifying the subtle change in a long protein sequence. In addition, MuPIPR leverages a Siamese residual recurrent convolutional neural encoder to encode a wild-type protein pair and its mutant pair. Multi-layer perceptron regressors are applied to the protein pair representations to predict the quantifiable changes of PPI properties upon mutations. Experimental evaluations show that, with only sequence information, MuPIPR outperforms various state-of-the-art systems in estimating the changes of binding affinity for SKEMPI v1, and offers comparable performance on SKEMPI v2. MuPIPR also demonstrates state-of-the-art performance in estimating the changes of buried surface areas. The software implementation is available at https://github.com/guangyu-zhou/MuPIPR.
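
The data flow described in the abstract (contextualized residue representations, a shared encoder applied to both the wild-type and the mutant pair, then a perceptron regressor) can be illustrated with a minimal NumPy sketch. This is not the MuPIPR implementation: the neighbourhood-averaging embedding is a toy stand-in for a learned contextualized representation, a single dense layer with max pooling replaces the residual recurrent convolutional encoder, and the way pair vectors are combined is an assumption. All names, dimensions, and the randomly initialised weights are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(0)

# Hypothetical, randomly initialised parameters; a real model would learn them.
EMB_DIM, HID_DIM = 8, 16
static_emb = rng.normal(size=(len(AMINO_ACIDS), EMB_DIM))
W_enc = rng.normal(size=(EMB_DIM, HID_DIM))
W_h = rng.normal(size=(2 * HID_DIM, HID_DIM))
w_out = rng.normal(size=HID_DIM)


def contextual_embed(seq, window=2):
    """Toy contextual embedding (assumption, not the paper's mechanism).

    Each row is the mean of the static embeddings within +/- `window`
    residues, so a point mutation alters several neighbouring rows.
    """
    idx = np.array([AMINO_ACIDS.index(a) for a in seq])
    x = static_emb[idx]
    out = np.empty_like(x)
    for i in range(len(seq)):
        lo, hi = max(0, i - window), min(len(seq), i + window + 1)
        out[i] = x[lo:hi].mean(axis=0)
    return out


def encode(seq):
    """Shared (Siamese) encoder: the same weights are used for every sequence."""
    h = np.tanh(contextual_embed(seq) @ W_enc)
    return h.max(axis=0)  # global max pooling gives a fixed-size vector


def encode_pair(seq_a, seq_b):
    """Combine the two protein vectors into one pair representation."""
    a, b = encode(seq_a), encode(seq_b)
    return np.concatenate([a * b, np.abs(a - b)])


def predict_change(wt_pair, mut_pair):
    """One-hidden-layer perceptron on the mutant-minus-wild-type difference."""
    diff = encode_pair(*mut_pair) - encode_pair(*wt_pair)
    hidden = np.maximum(0.0, diff @ W_h)  # ReLU
    return float(hidden @ w_out)
```

For example, `predict_change(("MKTAYIAKQR", "GSHMLEDPVA"), ("MKTAYAAKQR", "GSHMLEDPVA"))` scores a single I-to-A substitution in the first chain. With untrained weights the value is meaningless; the point is that one substituted residue changes five rows of the toy contextual embedding, whereas a static per-residue embedding would change only one.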

Figures

Figure 1.
Architecture of MuPIPR.
Figure 2.
The detailed framework of the two encoders.
Figure 3.
The performance of different predictors on SKEMPI v2 based upon different thresholds of sequence similarity (by the log E-value) when compared with the SKEMPI v1 training set. The numbers of samples in the four bins are 128, 78, 104 and 177, respectively.
Figure 4.
Correlations between predicted and experimental ΔΔG values for different types of mutated amino acids (i.e. ‘ALA’ and ‘NonALA’) in the SKP1102s dataset.
Figure 5.
Boxplots of prediction errors for different mutant types from the SKP1400m dataset.
Figure 6.
Scatter plot to compare the absolute errors by the complete version of MuPIPR and MuPIPR-static variant. The majority of points (formula image) are below the diagonal line, showing the predictions by the complete model to be generally more accurate.
Figure 7.
Mutation effects on structures and BSA. The structures of Chain A and Chain B of the Human Insulin protein complex are depicted respectively on the left and right of the complex. The mutation is highlighted on Chain B. The wild-type (2HIU) and mutant (2M2P) complexes are retrieved from PDB.
Figure 8.
Performance evaluation using different hyperparameters on the SKP1400m dataset for ΔΔG estimation. The Pearson’s correlation coefficient (Corr) and the root mean square error (RMSE) are reported in blue squares (left) and red triangles (right), respectively.
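
The two metrics named in the Figure 8 caption, Pearson's correlation coefficient and RMSE, can be computed from predicted and experimental ΔΔG values as in the sketch below. The function names are hypothetical and the code uses the standard textbook definitions; it is not taken from the MuPIPR repository.

```python
import numpy as np


def pearson_corr(y_true, y_pred):
    """Pearson's r: covariance of the centred vectors over the product of their norms."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    return float((yt @ yp) / np.sqrt((yt @ yt) * (yp @ yp)))


def rmse(y_true, y_pred):
    """Root mean square error between experimental and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

A higher correlation and a lower RMSE both indicate better agreement, which is why the caption reports them on separate axes.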
