Int J Mol Sci. 2022 Jun 24;23(13):7033.
doi: 10.3390/ijms23137033.

Protein-Protein Interaction Prediction for Targeted Protein Degradation

Oliver Orasch et al. Int J Mol Sci. 2022.

Abstract

Protein-protein interactions (PPIs) play a fundamental role in various biological functions; thus, detecting PPI sites is essential for understanding diseases and developing new drugs. PPI prediction is of particular relevance for the development of drugs employing targeted protein degradation, as their efficacy relies on the formation of a stable ternary complex involving two proteins. However, experimental methods to detect PPI sites are both costly and time-intensive. In recent years, machine learning-based methods have been developed as screening tools. While they are computationally more efficient than traditional docking methods and thus allow rapid execution, these tools have so far primarily been based on sequence information, and they are therefore limited in their ability to address spatial requirements. In addition, they have to date not been applied to targeted protein degradation. Here, we present a new deep learning architecture based on the concept of graph representation learning that can predict interaction sites and interactions of proteins based on their surface representations. We demonstrate that our model reaches state-of-the-art performance, as measured by AUROC, on the established MaSIF dataset. We furthermore introduce a new dataset with more diverse protein interactions and show that our model generalizes well to this new data. These generalization capabilities allow our model to predict the PPIs relevant for targeted protein degradation, which we show by demonstrating the high accuracy of our model for PPI prediction on the available ternary complex data. Our results suggest that PPI prediction models can be a valuable tool for screening protein pairs while developing new drugs for targeted protein degradation.

Keywords: deep graph representation learning; protein–protein interactions; targeted protein degradation; ternary complex.

Conflict of interest statement

All authors are affiliated with Celeris Therapeutics GmbH (Christopher Trummer and Noah Weber are C-level executives and hold equities), which has filed a patent application (PCT/EP2021/025372) based in part on this pipeline.

Figures

Figure 1
The relevance of protein–protein interactions for targeted protein degradation with bifunctional degraders. (A) To accomplish targeted protein degradation, the protein of interest (POI, yellow) is linked to the receptor (magenta) of an E3 ligase complex via a degrader molecule. Together, the POI, degrader, and E3 ligase form a ternary complex, allowing the passage of ubiquitin to the POI. (B) Ubiquitination of the POI leads to its degradation via the proteasomal system. The ubiquitin is recycled. (C) While the degrader molecule is instrumental for bringing the two proteins into proximity, cooperativity between the E3 ligase and the POI—i.e., strong protein–protein interactions—is essential for the formation of a stable ternary complex.
Figure 2
Model details. (A) Overall workflow for binding site prediction. The model outputs binary predictions of site activity for different points on the surface of proteins, which are input via PDB files. The main processing steps are surface mesh generation followed by the computation of local chemo-geometric features, which are processed in a deep graph representation learning (DGRL) pipeline. (B) Overall workflow for interaction prediction of two proteins. Each protein is processed separately in a pipeline similar to that used for binding site prediction. The final processing step combines the learned features and produces a binary output. (C) Example of meshed protein surfaces of hemoglobin alpha and beta chains (PDB ID: 1A01, computed with EDTSurf [34,35]). The mesh resolution was chosen as a compromise between coarse and fine meshes, giving a detailed representation of the surface while still allowing fast operations. The default EDTSurf parameters were used, with the probe radius set to 1.4 Å as described in the EDTSurf documentation. (D) Details of chemo-geometric feature generation. Chemical and geometrical features are generated in separate streams. The geometrical feature generation (horizontal) consists of a learned embedding of curvature features estimated in the neighborhood of a point under consideration (see text for details). The chemical feature computation (vertical) includes learned embeddings with added distance-dependent and precomputed chemical features, which are aggregated within a neighborhood and processed with a single elementary Cluster-GCN layer [36,37] (see text for details). (E) Details of final DGRL processing. The chemo-geometric features within a spherical neighborhood around a point under consideration influence the features at this point through learned weights generated from 3D positions and surface normals. The weighted features are processed with a multilayer perceptron (MLP), resulting in the final output.
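The neighborhood step described in panel (E) of Figure 2 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the radius value, and in particular `toy_weight` are hypothetical. In the paper the weights are learned from 3D positions and surface normals, whereas here a fixed distance-and-normal heuristic stands in for them, and the subsequent MLP is omitted.

```python
import math


def toy_weight(p_i, n_i, p_j, n_j, radius):
    """Hypothetical stand-in for the learned weights (an assumption, not the paper's model).

    Nearby points whose unit normals point the same way as the center's get more weight.
    """
    d = math.dist(p_i, p_j)
    closeness = math.exp(-((d / radius) ** 2))
    alignment = sum(a * b for a, b in zip(n_i, n_j))  # cosine for unit normals
    return closeness * max(alignment, 0.0)


def aggregate_neighborhood(points, normals, features, radius):
    """Replace each point's features by a weighted mean over its spherical neighborhood."""
    aggregated = []
    for i, p_i in enumerate(points):
        # All points (including i itself) within `radius` of point i.
        idx = [j for j, p_j in enumerate(points) if math.dist(p_i, p_j) <= radius]
        weights = [toy_weight(p_i, normals[i], points[j], normals[j], radius) for j in idx]
        total = sum(weights)
        if total == 0.0:
            # Degenerate case (e.g. zero-length normal): keep the point's own features.
            aggregated.append(list(features[i]))
            continue
        dim = len(features[i])
        aggregated.append(
            [sum(w * features[j][k] for w, j in zip(weights, idx)) / total for k in range(dim)]
        )
    return aggregated


# Three surface points: two close together, one far away (made-up coordinates).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 3
features = [[1.0], [3.0], [5.0]]
smoothed = aggregate_neighborhood(points, normals, features, radius=2.0)
print(smoothed)
```

In this toy input the two nearby points pull each other's features toward a common value, while the isolated third point keeps its own feature unchanged.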
Figure 3
Example evaluation on ternary complex data. (A) PPIs of proteins involved in the ternary complex composed of BTK, cIAP ubiquitin ligase, and compound 17 (PDB ID: 6W7O). (B) Corresponding ROC curve. See Appendix B for predictions on additional ternary complex data.
Figure 4
Histograms of AUROC of PPI prediction for the ternary complex dataset defined in Table 3, for models trained on different training sets. (A) Results for a model trained on the MaSIF dataset. (B) Results for a model trained on the Orthogonal dataset. (C) Results for a model trained on the super dataset, i.e., the union of the MaSIF and Orthogonal datasets. The increase in the mean AUROC indicates that the MaSIF dataset lacks diversity of protein structures; thus, it is not sufficient for the development of models in the context of targeted protein degradation. Adding the structures of the Orthogonal dataset leads to a strong improvement. See Table A5, Table A6 and Table A7 in Appendix B for the underlying results on each ternary complex, and Table A8 and Table A9 in Appendix B for example confusion matrices using a threshold of 0.5.
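The two evaluation quantities named in the Figure 4 caption, AUROC and a confusion matrix at a threshold of 0.5, can be sketched in plain Python. The labels and scores below are made-up illustration values, not results from the paper, and the function names are hypothetical. AUROC is computed here through its pairwise-ranking interpretation, which may differ from the tooling the authors used.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_matrix(labels, scores, threshold=0.5):
    """Count TP, FP, TN, FN after binarizing scores with `score >= threshold`."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical per-point labels (1 = interacting) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(auroc(labels, scores))             # 6 of 9 positive/negative pairs ranked correctly
print(confusion_matrix(labels, scores))  # counts at the 0.5 threshold
```

AUROC is threshold-free, which is why the histograms can compare models directly, whereas the confusion matrix depends on the chosen cutoff.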
Figure 5
Biochemical representation of predictions for the CRBN–BRD4 complex without the dBET55 degrader (PDB ID: 6BN8). Surface point clouds are superimposed on the original PDB structure. Blue and red points indicate predicted interacting points for chains B and C, respectively (AUROC: 0.987; inference performed using the model trained on the “super” dataset).
