Sci Rep. 2021 Jun 28;11(1):13369.
doi: 10.1038/s41598-021-92825-5.

Exploring the chemical space of protein-protein interaction inhibitors through machine learning

Jiwon Choi et al. Sci Rep. 2021.

Abstract

Although protein-protein interactions (PPIs) have emerged as the basis of potential new therapeutic approaches, targeting intracellular PPIs with small molecule inhibitors is conventionally considered highly challenging. Driven by increasing research efforts, success rates have increased significantly in recent years. In this study, we analyze the physicochemical properties of 9351 non-redundant inhibitors present in the iPPI-DB and TIMBAL databases to define a computational model for active compounds acting against PPI targets. Principal component analysis (PCA) and k-means clustering were used to identify plausible PPI targets in regions of interest in the active group in the chemical space between active and inactive iPPI compounds. Notably, the uniquely defined active group exhibited distinct differences in activity compared with other active compounds. These results demonstrate that active compounds with regions of interest in the chemical space may be expected to provide insights into potential PPI inhibitors for particular protein targets.
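
The PCA and k-means step described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' pipeline: the seven-column data below is synthetic, the function names (`standardize`, `principal_axes`, `kmeans`) are hypothetical, and power iteration and farthest-point seeding are stand-ins chosen to keep the example short. In practice a statistics library would normally be used for both steps.

```python
import math
import random


def standardize(rows):
    """Scale every descriptor column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def principal_axes(z, k=2, iters=500, seed=0):
    """Approximate the leading k covariance eigenvectors (power iteration + deflation)."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)] for i in range(d)]
    rng = random.Random(seed)
    axes = []
    for _component in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for this axis.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        axes.append(v)
        # Deflate so the next pass finds the next-largest direction.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return axes


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iters=100):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _step in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic stand-in for a compounds-by-descriptors table (NOT data from the paper):
# two groups of 20 "compounds", each with seven made-up descriptor values.
rng = random.Random(42)
group_a = [[rng.gauss(0.0, 1.0) for _ in range(7)] for _ in range(20)]
group_b = [[rng.gauss(5.0, 1.0) for _ in range(7)] for _ in range(20)]

z = standardize(group_a + group_b)
axes = principal_axes(z, k=2)
# Project each compound onto the two principal axes (its PC1 and PC2 scores).
scores = [[sum(x * a for x, a in zip(row, axis)) for axis in axes] for row in z]
labels = kmeans(scores, k=2)
print(labels)
```

Standardizing first matters because descriptors on different scales would otherwise dominate the principal components. Clustering on the two-dimensional scores rather than the raw descriptors is a choice made for this sketch; the abstract does not say which space the authors clustered in.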

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Distributions of compounds for target proteins of iPPI datasets. The colored histogram shows the frequency distribution against numbers of known compounds for each PPI target. Panels (A) and (B) indicate results for the active and inactive groups in the iPPI datasets, respectively.
Figure 2
Physicochemical profile of compounds from iPPI datasets. (A–G) Chemical properties of the compounds from the iPPI datasets are compared using histograms for the seven molecular descriptors. The dotted lines represent mean values, and the histogram bars of the active and inactive groups are colored red and light green, respectively, whereas the dark green bars represent the overlap region. (H) Distribution of the chemical space of the compounds in the iPPI datasets according to principal component analysis. All histograms and scatter plots were generated using the R software.
Figure 3
Visual representation of the chemical space of the Bcl-2 and MDM2 datasets. Principal component analysis (PCA)-based clustering comparing the chemical space of the active and inactive compounds in the Bcl-2 and MDM2 datasets. (A,B) Distribution of the chemical space of the compounds in the Bcl-2 and MDM2 datasets according to principal component analysis. The loading plot vectors are represented by arrows for each physicochemical property. (C,D) Data points are color-coded by cluster of molecules. The magenta and blue dots correspond to active compounds in Clusters 1 and 2, respectively.
Figure 4
PCA plot of the Bcl-2 dataset. The visual representation was generated with principal component analysis of seven drug-like physicochemical properties. The loading plot vectors are represented by arrows for each physicochemical property. The blue dots (A), red dots (B), cyan dots (C), magenta dots (D), and green dots (E) represent Class 1, 2, 3, 4, and 5, respectively.
Figure 5
Distributions of activity values and Glide scores of the Cluster 1 and Cluster 2 active datasets. Histograms of (A) activity values and (B) SP GlideScore distributions. The dotted lines represent mean values, and the histogram bars of the Cluster 1 and Cluster 2 active sets are colored magenta and blue, respectively, whereas dark blue represents their overlap region. Boxplots of (C) activity values and (D) SP GlideScore distributions of the active dataset for each of Clusters 1 and 2. The magenta and blue colors correspond to the active sets of Clusters 1 and 2, respectively.
