Brief Bioinform. 2023 Sep 20;24(5):bbad310.
doi: 10.1093/bib/bbad310.

MpbPPI: a multi-task pre-training-based equivariant approach for the prediction of the effect of amino acid mutations on protein-protein interactions

Yang Yue et al. Brief Bioinform. 2023.

Abstract

The accurate prediction of the effect of amino acid mutations for protein-protein interactions (PPI $\Delta \Delta G$) is a crucial task in protein engineering, as it provides insight into the relevant biological processes underpinning protein binding and provides a basis for further drug discovery. In this study, we propose MpbPPI, a novel multi-task pre-training-based geometric equivariance-preserving framework to predict PPI $\Delta \Delta G$. Pre-training on a strictly screened pre-training dataset is employed to address the scarcity of protein-protein complex structures annotated with PPI $\Delta \Delta G$ values. MpbPPI employs a multi-task pre-training technique, forcing the framework to learn comprehensive backbone and side chain geometric regulations of protein-protein complexes at different scales. After pre-training, MpbPPI can generate high-quality representations capturing the effective geometric characteristics of labeled protein-protein complexes for downstream $\Delta \Delta G$ predictions. MpbPPI serves as a scalable framework supporting different sources of mutant-type (MT) protein-protein complexes for flexible application. Experimental results on four benchmark datasets demonstrate that MpbPPI is a state-of-the-art framework for PPI $\Delta \Delta G$ predictions. The data and source code are available at https://github.com/arantir123/MpbPPI.

Keywords: equivariant neural network; multi-task pre-training; protein binding affinity change prediction; protein engineering.

Figures

Figure 1
The flowchart of the MpbPPI framework. For each pre-training and downstream sample point, MpbPPI generates the residue-level KNN and radius contact graphs, which contain different-scale residue backbone and side chain geometric information of the corresponding protein–protein complex structure (see Methods section for details). In the pre-training phase (A), the proposed GEE encoder learns the geometric regulations of protein–protein complexes through our defined four geometric property-related denoising/recovery tasks. After that, MpbPPI uses a GBT-based decoder to predict PPI $\Delta \Delta G$ for a WT–MT complex pair based on their encoded representations (B).
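
The two residue-level graph types named in the Figure 1 caption can be illustrated with a minimal sketch. It assumes each residue is reduced to a single 3D point (for example its Cα atom) and that edges are directed (source, neighbour) pairs. The function names `knn_edges` and `radius_edges` and the values of `k` and `radius` are hypothetical choices for illustration; they are not taken from the MpbPPI source code.

```python
import math

def knn_edges(coords, k):
    """Connect every residue to its k nearest residues (directed edges).

    coords: list of (x, y, z) tuples, one per residue (assumed Cα positions).
    Ties in distance are broken by the lower residue index.
    """
    edges = []
    for i, ci in enumerate(coords):
        # Distance from residue i to every other residue, sorted ascending.
        others = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in others[:k])
    return edges

def radius_edges(coords, radius):
    """Connect every pair of distinct residues within `radius` of each other."""
    return [
        (i, j)
        for i, ci in enumerate(coords)
        for j, cj in enumerate(coords)
        if i != j and math.dist(ci, cj) <= radius
    ]

# Toy coordinates on a line (illustrative only, not real structure data).
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
knn_graph = knn_edges(toy, k=1)
radius_graph = radius_edges(toy, radius=2.5)
```

In this toy case the KNN graph always gives the isolated fourth residue an edge, while the radius graph leaves it unconnected, which is one way the two graphs can carry information at different scales.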
Figure 2
Panel (A) illustrates the MpbPPI data flow in the pre-training phase. In this phase, the pre-training protein–protein complex (residue number: N), represented by the KNN (edge number: EK) and radius contact graphs (edge number: ER), is sent to a five-layer GEE encoder. Based on message propagation, the encoder outputs updated embeddings of every residue node in the current complex, which are sent simultaneously to four multi-layer perceptrons (MLPs), each specific to a different pre-training task, to guide the model optimization. The input/output dimensions of each intermediate layer are shown around that layer. In the downstream $\Delta \Delta G$ prediction phase (B), WT and mutant PPI structures represented by the same types of contact graphs as above are sent to the trained GEE to produce separate residue node embedding sets, which are then sent to the GBT-based decoder to predict the final $\Delta \Delta G$ for the current sample point (see Methods section). Panel (C) illustrates the basic message propagation scheme in each GEE layer, in which similar operations are performed on each (central) residue node in the protein–protein complex.
Figure 3
MpbPPI outperformed the other compared methods for PPI $\Delta \Delta G$ prediction under the five-time WT protein–protein complex-based cross-validations. We reported the experimental results on each dataset based on the main evaluation metrics. For the machine learning-based methods, the results were expressed as mean ± SD, while for the empirical energy-based methods, the results were expressed as the mean value. MpbPPI (Backb+Sidec+SASA+AA) and MpbPPI (Backb+SASA+AA) were abbreviated as MpbPPI_BSSA and MpbPPI_BSA.
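
A rough sketch of what a WT complex-based split and a mean ± SD summary might look like is given below. It assumes that "complex-based" means all mutations of one WT complex are kept in the same fold, and that SD is the sample standard deviation; neither detail is confirmed by this record. The helper names `group_folds` and `mean_sd` are hypothetical.

```python
import random
import statistics

def group_folds(complex_ids, n_folds, seed=0):
    """Assign each sample a fold index so that samples sharing a WT
    complex ID always land in the same fold (assumed splitting rule)."""
    groups = sorted(set(complex_ids))
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Deal the shuffled complexes out to folds in round-robin order.
    fold_of_group = {g: i % n_folds for i, g in enumerate(groups)}
    return [fold_of_group[c] for c in complex_ids]

def mean_sd(values):
    """Return (mean, sample standard deviation) over repeated runs."""
    return statistics.mean(values), statistics.stdev(values)

# Placeholder complex labels, one per mutation sample (not real data).
sample_complexes = ["A", "A", "B", "B", "C", "D"]
folds = group_folds(sample_complexes, n_folds=2)
```

Keeping every mutation of a complex in one fold prevents a model from being tested on a complex it has already seen during training, which is the usual motivation for grouped splits.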
Figure 4
Comparison of mutant PPI structures from various mutant generation tools. An example of the structural differences between WT and mutant structures (PDB ID: 1AK4). The WT structure and mutant structures generated by FoldX, MODELLER and AlphaFold2 are shown in different colours for better identification. The mutant amino acid and its neighboring amino acids’ backbone and side chain are represented as sticks, with the mutant amino acid highlighted in surface style. The Cα RMSD values between the WT and mutant structures were 0 Å for FoldX, 0.3 Å for MODELLER and 3.8 Å for AlphaFold2.
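
For reference, a Cα RMSD of the kind quoted in the Figure 4 caption can be computed as in the sketch below. It assumes the two structures have the same residues in the same order and are already expressed in a common coordinate frame, so no superposition step is applied; whether the reported values were computed with or without superposition is not stated here. The function name `ca_rmsd` is hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) Cα coordinates, without any structural superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Toy coordinates (illustrative only): the second set is shifted 3 Å along x.
wt = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
mt = [(3.0, 0.0, 0.0), (6.8, 0.0, 0.0)]
shift_rmsd = ca_rmsd(wt, mt)
```

Under this definition, a tool that leaves the backbone coordinates untouched yields a Cα RMSD of exactly 0 Å against the WT structure.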
