J Cheminform. 2021 Sep 8;13(1):65.
doi: 10.1186/s13321-021-00547-7.

PUResNet: prediction of protein-ligand binding sites using deep residual neural network

Jeevan Kandel et al. J Cheminform. 2021.

Abstract

Background: Predicting protein-ligand binding sites is a fundamental step in understanding the functional characteristics of proteins. It plays a vital role in elucidating different biological functions and is a crucial step in drug discovery. A protein exhibits its true nature after binding to its interacting molecule, known as a ligand, which binds only at a favorable binding site of the protein structure. Different computational methods exploiting the features of proteins have been developed to identify binding sites in the protein structure, but none seems to provide promising results, and further investigation is therefore required.

Results: In this study, we present PUResNet, a deep learning model, together with a novel data cleaning process based on structural similarity, for predicting protein-ligand binding sites. From the full scPDB database (an annotated database of druggable binding sites extracted from the Protein Data Bank), 5020 protein structures were selected and used to train PUResNet. Evaluated on two independent sets using distance, volume and proportion metrics, PUResNet achieved better and justifiable performance than the existing methods.

Keywords: Binding site prediction; Convolutional neural network; Data cleaning; Deep residual network; Ligand binding sites.
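
The distance and volume metrics named in the Results are not defined in the abstract. The following is a minimal sketch of how such metrics are commonly computed, under stated assumptions: DCC is taken to be the distance between the center of the predicted site and the center of the true site (or ligand), DVO is taken to be the intersection-over-union of the two sites on a discretized grid, and a prediction counts as a success when DCC falls within a threshold (4 Å is assumed here). The function names are hypothetical, and this is not the authors' implementation.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def dcc(predicted_site, true_site):
    """Assumed DCC: distance (in Å) between the two site centers."""
    return math.dist(centroid(predicted_site), centroid(true_site))


def dvo(predicted_voxels, true_voxels):
    """Assumed DVO: shared grid cells divided by all occupied grid cells."""
    pred, true = set(predicted_voxels), set(true_voxels)
    union = pred | true
    return len(pred & true) / len(union) if union else 0.0


def success_rate(dcc_values, threshold=4.0):
    """Fraction of predictions with DCC at or below the threshold (assumed 4 Å)."""
    return sum(d <= threshold for d in dcc_values) / len(dcc_values)


# Toy coordinates, for illustration only (not data from the paper).
example_dcc = dcc([(0, 0, 0), (2, 0, 0)], [(1, 0, 0), (1, 3, 0)])
example_dvo = dvo([(0, 0, 0), (0, 0, 1)], [(0, 0, 1), (0, 0, 2)])
example_rate = success_rate([1.5, 3.9, 6.0, 10.0])
```

Sweeping `threshold` over a range of values and plotting `success_rate` at each one would give a curve of the kind the success rate plots in the figure list describe.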

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Flow diagram of data cleaning process
Fig. 2
Flow diagram showing calculation of Tanimoto index
Fig. 3
Model PUResNet architecture showing both encoder and decoder block with skip connections
Fig. 4
Success rate plot for different DCC values combining all folds (kalasanty vs PUResNet)
Fig. 5
Histogram of DVO values combining all folds (kalasanty vs PUResNet)
Fig. 6
Success rate plot for different DCC values in Coach420 dataset (PUResNet vs kalasanty)
Fig. 7
Histogram of DVO values for protein structure having DCC 4 Å in Coach420
Fig. 8
Histogram of PLI values for protein structure having DCC 4 Å in Coach420
Fig. 9
Success rate plot for different DCC values in BU48 dataset (PUResNet vs kalasanty)
Fig. 10
Histogram of DVO values for protein structure having DCC 4 Å in BU48
Fig. 11
Histogram of PLI values for protein structure having DCC 4 Å in BU48
Fig. 12
Scatter plot showing DCC values of Coach420 dataset predicted by kalasanty and PUResNet with different views (I–V), View I showing DCC values 20 Å from PUResNet and kalasanty, View II showing DCC values 10 Å from PUResNet and kalasanty, View III showing DCC values 120 Å from PUResNet and 20 Å from kalasanty, View IV showing DCC values 120 Å from kalasanty and 20 Å from PUResNet and View V showing DCC values 124 Å from kalasanty and PUResNet
Fig. 13
Scatter plot showing DCC values of BU48 dataset predicted by kalasanty and PUResNet with different views (I–V), View I showing DCC values 20 Å from PUResNet and kalasanty, View II showing DCC values 10 Å from PUResNet and kalasanty, View III showing DCC values 120 Å from PUResNet and 20 Å from kalasanty, View IV showing DCC values 120 Å from kalasanty and 20 Å from PUResNet and View V showing DCC values 120 Å from kalasanty and PUResNet
Fig. 14
Protein structures (2zhz, 3h39, 3gpl, 7est, 2w1a, 1a4k) from Coach420, showing the binding sites predicted by kalasanty (blue region) and PUResNet (red region)
Fig. 15
Bound and unbound pairs ((1a6u, 1a6w), (1gcg, 1gca)), showing the binding sites predicted by kalasanty (blue region) and PUResNet (red region)

