Bioinformatics. 2021 Sep 9;37(17):2580-2588.
doi: 10.1093/bioinformatics/btab154.

Protein interaction interface region prediction by geometric deep learning

Bowen Dai et al. Bioinformatics.

Abstract

Motivation: Protein-protein interactions drive wide-ranging molecular processes, and characterizing at the atomic level how proteins interact (beyond just the fact that they interact) can provide key insights into understanding and controlling this machinery. Unfortunately, experimental determination of three-dimensional protein complex structures remains difficult and does not scale to the increasingly large sets of proteins whose interactions are of interest. Computational methods are thus required to meet the demands of large-scale, high-throughput prediction of how proteins interact, but both physical modeling and machine learning methods suffer from poor precision and/or recall.

Results: To improve performance in predicting protein interaction interfaces, we leverage the best properties of both data- and physics-driven methods to develop a unified Geometric Deep Neural Network, 'PInet' (Protein Interface Network). PInet consumes pairs of point clouds encoding the structures of two partner proteins in order to predict the structural regions that mediate their interaction. To make such predictions, PInet learns and utilizes models capturing both geometrical and physicochemical molecular surface complementarity. In application to a set of benchmarks, PInet simultaneously predicts the interface regions on both interacting proteins, achieving performance equivalent to or even much better than the state-of-the-art predictor for each dataset. Furthermore, since PInet is based on joint segmentation of a representation of the protein surfaces, its predictions are meaningful in terms of the underlying physical complementarity driving molecular recognition.

Availability and implementation: PInet scripts and models are available at https://github.com/FTD007/PInet.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Different proteins can recognize the same protein partner in very different ways, as shown here for hen egg lysozyme (HEL; gray surface, 3LZT) and different antibodies (colored cartoon; purple: 1BVK, blue: 1DQJ, red: 1MLC and orange: 2I25). Partner-independent predictions seek to label, in general, what parts of one protein might be recognized by unknown other proteins. In this example, even with just these few example antibodies, much of the HEL surface is recognized by one antibody or another, so partner-independent prediction of binding region would largely cover the surface. Thus when information about particular partners is available, partner-specific predictions can be beneficial, providing a more specific characterization of recognition by separately localizing each partner. (Color version of this figure is available at Bioinformatics online.)
Fig. 2.
Overview of our Protein Interface Network (PInet) approach for predicting interaction regions on pairs of proteins. PInet consumes two 5-dimensional point clouds representing the geometry and physicochemical properties of each protein surface, and performs a semantic segmentation on all points from both point clouds simultaneously. It first processes each point cloud separately. For each protein, a Spatial Transformation Network renders the surface point cloud invariant to rigid-body transformations. Then a multi-layer perceptron (MLP) extracts local surface features. These local surface features are then aggregated into a global protein feature vector. With each protein thus processed, the local surface features and global protein features from both proteins are concatenated in order to be segmented by another MLP. The trainable weights for canonical transformation and for local and global feature extraction are shared between the two proteins.
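The data flow described in the Fig. 2 caption can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the layer sizes, the random (untrained) weights, the split of the 5 dimensions into three coordinates plus two physicochemical channels, the use of max pooling for aggregation, and the order in which the two global vectors are concatenated are all hypothetical choices made for illustration. The learned 3x3 transform here merely stands in for the Spatial Transformation Network and does not by itself guarantee rigid-body invariance.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a per-point MLP (hypothetical layer sizes)."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers, final_relu=True):
    """Apply the same dense layers to every point (each row of x)."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if final_relu or i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def encode(cloud, params):
    """Shared encoder: alignment, local features, pooled global feature."""
    xyz, props = cloud[:, :3], cloud[:, 3:]
    # Stand-in for the spatial transformer: predict a 3x3 matrix from xyz.
    pooled = mlp(xyz, params["stn"]).max(axis=0)
    transform = np.eye(3) + (pooled @ params["stn_out"]).reshape(3, 3)
    aligned = np.concatenate([xyz @ transform, props], axis=1)
    local = mlp(aligned, params["local"])                    # (N, 64)
    global_feat = mlp(local, params["global"]).max(axis=0)   # (256,)
    return local, global_feat

def segment(cloud_a, cloud_b, params):
    """Per-point interface probabilities for both proteins at once."""
    local_a, glob_a = encode(cloud_a, params)  # same weights for both
    local_b, glob_b = encode(cloud_b, params)

    def head(local, own, partner):
        # Assumed order: local features, own global, partner global.
        context = np.concatenate([own, partner])
        tiled = np.broadcast_to(context, (local.shape[0], context.size))
        features = np.concatenate([local, tiled], axis=1)
        logits = mlp(features, params["seg"], final_relu=False)
        return 1.0 / (1.0 + np.exp(-logits[:, 0]))

    return head(local_a, glob_a, glob_b), head(local_b, glob_b, glob_a)

params = {
    "stn": init_mlp([3, 32, 64]),
    "stn_out": rng.normal(0, 0.01, (64, 9)),
    "local": init_mlp([5, 32, 64]),
    "global": init_mlp([64, 128, 256]),
    "seg": init_mlp([64 + 2 * 256, 128, 1]),
}

# Two synthetic 5-dimensional surface point clouds of different sizes.
cloud_a = rng.normal(size=(200, 5))
cloud_b = rng.normal(size=(150, 5))
prob_a, prob_b = segment(cloud_a, cloud_b, params)
```

Because every per-point layer is applied row by row and the aggregation is a maximum over points, reordering the points of one cloud simply reorders its predictions, which is the property that lets a network of this kind operate on unordered point clouds.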
Fig. 3.
Binding interface prediction for one protein (hen egg lysozyme) with four different antibodies. (Top four rows) PInet predictions, from two different viewpoints. The heatmap shows predicted probability of being in the interface, with darker red for higher probability and darker blue for lower. (Bottom row) Discotope prediction for 3LZT, again with a heatmap showing predicted probability (darker red higher, darker blue lower). (Color version of this figure is available at Bioinformatics online.)
Fig. 4.
Segmentation visualization for three example (median PInet prediction performance) Enzyme-Partner pairs from DBD5: Enzyme-Inhibitor 3VLB (top), Enzyme-Substrate 4H03 (middle) and Enzyme complex with a regulatory or accessory chain 1GLA (bottom). Heatmaps for the partner indicate the predicted probability of being in the interface, with darker red for higher probability and darker blue for lower. (Color version of this figure is available at Bioinformatics online.)
Fig. 5.
Segmentation visualization for three example Ab-Ag pairs: best 4JR9, top 25% 3LIZ and 50% 3RAJ (according to AUC-PR). For these three pairs, precision ranges from 13% to 23% and recall ranges from 65% to 93%. Green structures are antibodies and gray structures are antigens, while heatmaps for the partner indicate the predicted probability of being in the interface, with darker red for higher probability and darker blue for lower. (Color version of this figure is available at Bioinformatics online.)
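The precision and recall figures quoted in the Fig. 5 caption are standard classification metrics computed over per-point (or per-residue) interface labels. The following is a minimal sketch of that arithmetic, assuming a fixed probability threshold of 0.5; the threshold, the function name `precision_recall`, and the toy data are illustrative assumptions and do not come from the paper.

```python
def precision_recall(probs, labels, threshold=0.5):
    """Precision and recall of thresholded interface predictions.

    probs  : predicted interface probabilities, one per point
    labels : 1 if the point is truly in the interface, else 0
    """
    predicted = [p >= threshold for p in probs]
    tp = sum(1 for pred, y in zip(predicted, labels) if pred and y)
    fp = sum(1 for pred, y in zip(predicted, labels) if pred and not y)
    fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: four points predicted as interface, three true interface points.
probs = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
labels = [1, 0, 0, 1, 0, 1]
precision, recall = precision_recall(probs, labels)
```

In this toy case half of the predicted points are true interface points (precision 0.5) while two of the three true interface points are recovered (recall of about 0.67). AUC-PR summarizes the same trade-off by sweeping the threshold rather than fixing it.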
