Biomolecules. 2022 Jul 29;12(8):1053.
doi: 10.3390/biom12081053.

GraphSite: Ligand Binding Site Classification with Deep Graph Learning

Wentao Shi et al. Biomolecules. 2022.

Abstract

The binding of small organic molecules to protein targets is fundamental to a wide array of cellular functions. It is also routinely exploited to develop new therapeutic strategies against a variety of diseases. The ability to effectively detect and classify ligand binding sites in proteins is therefore of paramount importance to modern structure-based drug discovery. These complex, non-trivial tasks require sophisticated algorithms from the field of artificial intelligence to achieve high prediction accuracy. In this communication, we describe GraphSite, a deep learning-based method that uses a graph representation of local protein structures and a state-of-the-art graph neural network to classify ligand binding sites. Using neural weighted message passing layers to effectively capture the structural, physicochemical, and evolutionary characteristics of binding pockets mitigates model overfitting and improves the classification accuracy. Comprehensive cross-validation benchmarks against a large dataset of binding pockets belonging to 14 diverse functional classes demonstrate that GraphSite achieves a class-weighted F1-score of 81.7%, outperforming other approaches such as molecular docking and binding site matching. It also generalizes well to unseen data, with an F1-score of 70.7%, which is the expected performance in real-world applications. We also discuss new directions to improve and extend GraphSite in the future.

Keywords: deep learning; graph neural network; ligand binding sites; structure-based drug discovery.
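
The class-weighted F1-score quoted in the abstract can be illustrated with a short sketch. This is not the authors' evaluation code: it assumes the common definition in which each class's F1-score is weighted by its share of the true labels, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def class_weighted_f1(y_true, y_pred):
    """Average per-class F1-scores, weighting each class by its support.

    Assumed definition: weight = (number of true samples of the class) / total.
    """
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_cls / total) * f1
    return score

# Toy labels for three hypothetical pocket classes.
y_true = ["a", "a", "b", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "b", "a"]
print(class_weighted_f1(y_true, y_pred))
```

Weighting by support keeps large classes from being under-counted, but it also means a poorly predicted rare class lowers the score only slightly.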

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Example of the graph representation of a binding site. (A) The structure of a binding pocket for ADP in DnaA regulatory inactivator Hda from E. coli (PDB-ID: 5x06). (B) The graph representation of four residues, W20, R174, E14, and R53, selected from (A).
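
As a rough illustration of what a graph representation of a pocket can look like, the sketch below connects nodes whose 3D coordinates lie within a distance cutoff and stores the distance as the edge attribute. The cutoff rule, the node identifiers, and the coordinates are all assumptions made for illustration; the caption does not specify how GraphSite defines its nodes or edges.

```python
import math
from itertools import combinations

def build_pocket_graph(coords, cutoff):
    """Build an undirected graph from 3D points.

    coords: {node_id: (x, y, z)}; cutoff: maximum distance for an edge
    (a hypothetical parameter). Returns {node: {neighbor: distance}}.
    """
    graph = {node: {} for node in coords}
    for a, b in combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d <= cutoff:
            graph[a][b] = d
            graph[b][a] = d
    return graph

# Made-up coordinates, not taken from PDB entry 5x06.
coords = {"n0": (0.0, 0.0, 0.0), "n1": (3.0, 4.0, 0.0), "n2": (10.0, 0.0, 0.0)}
print(build_pocket_graph(coords, cutoff=6.0))
```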
Figure 2
Architecture of the pocket classifier in GraphSite. (A) The input graph represents a binding site. (B) A neural network computing the weight for message passing from the edge attributes of the input graph. (C) Message passing layers of the jumping knowledge network. (D) A global pooling layer implementing the Set2Set model. (E) Fully connected layers generate the final classification results.
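
Panel (B) describes a network that turns edge attributes into message-passing weights. The sketch below is a deliberately simplified, single-step version of that idea, assuming a one-layer sigmoid "edge network" that outputs a scalar weight and a sum aggregation. The real model may produce weight matrices and use a different aggregation; every name and parameter here is hypothetical.

```python
import math

def edge_weight(edge_attr, params):
    """Hypothetical one-layer edge network: sigmoid(w . edge_attr + b)."""
    w, b = params
    z = sum(wi * xi for wi, xi in zip(w, edge_attr)) + b
    return 1.0 / (1.0 + math.exp(-z))

def message_passing_step(node_feats, edges, params):
    """One round of edge-weighted message passing.

    node_feats: {node: [floats]}; edges: [(src, dst, edge_attr)].
    Each node keeps its own features and adds the weighted features
    of its incoming neighbors (sum aggregation, an assumption).
    """
    updated = {node: list(feats) for node, feats in node_feats.items()}
    for src, dst, attr in edges:
        wgt = edge_weight(attr, params)
        for k, value in enumerate(node_feats[src]):
            updated[dst][k] += wgt * value
    return updated

feats = {"n0": [1.0, 0.0], "n1": [0.0, 2.0]}
edges = [("n0", "n1", [0.0]), ("n1", "n0", [0.0])]
print(message_passing_step(feats, edges, params=([1.0], 0.0)))
```

With a zero edge attribute and zero bias, the sigmoid returns 0.5, so each node receives half of its neighbor's features.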
Figure 3
Confusion matrix for classification with GraphSite on the benchmarking dataset. Each row of the confusion matrix is normalized. Numbers on the diagonal correspond to the recall of each class, while other numbers indicate the fraction of misclassified pockets.
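
The row normalization described in this caption can be sketched as follows: each row of raw counts is divided by its total, so the diagonal entry equals the recall of that class. The function name and the toy labels are hypothetical.

```python
def row_normalized_confusion(y_true, y_pred, classes):
    """Return a confusion matrix whose rows (true classes) each sum to 1."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    normalized = []
    for row in counts:
        row_total = sum(row)
        # Rows for classes with no true samples are left as zeros.
        normalized.append([v / row_total if row_total else 0.0 for v in row])
    return normalized

matrix = row_normalized_confusion(
    ["a", "a", "a", "b"], ["a", "a", "b", "b"], classes=["a", "b"]
)
print(matrix)  # diagonal entries are the per-class recall
```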
Figure 4
Structure alignments between misclassified pockets and those belonging to the predicted class. (A) PIPES (orange sticks) binding site in CENP-E (purple surface) and ATP (cyan sticks) binding site in FGAM synthase II (yellow surface). (B) MES (orange sticks) binding site in zitR (purple surface) and ATP (cyan sticks) binding site in FGAM synthase II (yellow surface). (C) Imatinib (orange sticks) binding site in ANC-AS (purple surface) and ATP (cyan sticks) binding site in FGAM synthase II (yellow surface). (D) (3R)-3-hydroxy-2,4-dioxopentyl dihydrogen phosphate (orange sticks) binding site in LsrF (purple surface) and arginine (cyan sticks) binding site in AT (yellow surface). (E) Colchicine (orange sticks) binding site in BRD4 (purple surface) and ATP (cyan sticks) binding site in FGAM synthase II (yellow surface). (F) Tromethamine (orange sticks) binding site in MAT (purple surface) and di(hydroxyethyl)ether (cyan sticks) binding site in BtR318A (yellow surface).
Figure 5
Distribution of the classification confidence for benchmarking and negative datasets. The classification confidence is the probability of the top-ranked ligand binding class predicted by GraphSite.
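
A minimal sketch of this confidence measure, assuming the class probabilities come from a softmax over the classifier's raw output scores (the caption does not state how the probabilities are produced). The function name and the example scores are hypothetical.

```python
import math

def classification_confidence(logits):
    """Return (index of top-ranked class, its softmax probability)."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top = max(range(len(probs)), key=probs.__getitem__)
    return top, probs[top]

top_class, confidence = classification_confidence([2.0, 1.0, 0.0])
print(top_class, round(confidence, 3))
```

A pocket that fits none of the trained classes would be expected to produce a flatter probability vector and hence a lower confidence, which is the contrast the figure draws between the benchmarking and negative datasets.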
Figure 6
Architecture of Siamese-GraphSite. This model requires a pair of graph-structured data as the input for two embedding networks sharing their parameters and utilizes the contrastive loss function.
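
The contrastive loss mentioned in this caption can be sketched in one common form: pairs from the same class are penalized by their squared embedding distance, and pairs from different classes are penalized only when they are closer than a margin. The exact formulation and margin used by Siamese-GraphSite are not given here, so the form below, the default margin, and the function names are assumptions.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Assumed form: d^2 for similar pairs, max(0, margin - d)^2 otherwise."""
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Embeddings here stand in for the outputs of the two shared-weight networks.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False, margin=6.0))
```

Because both embedding networks share parameters, the same function maps each pocket of a pair into the embedding space, and the loss then pulls same-class pairs together while pushing different-class pairs at least a margin apart.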
Figure 7
t-SNE visualization of embeddings generated by Siamese-GraphSite. Each dot represents one pocket colored by the cluster assignment.
