BMC Bioinformatics. 2019 Sep 18;20(1):478.
doi: 10.1186/s12859-019-3058-0.

A novel protein descriptor for the prediction of drug binding sites

Mingjian Jiang et al. BMC Bioinformatics.

Abstract

Background: Binding sites are the pockets of proteins that can bind drugs, and discovering these pockets is a critical step in drug design. Computational prediction of protein pockets can save manpower and financial resources.

Results: In this paper, a novel protein descriptor for the prediction of binding sites is proposed. Information on non-bonded interactions in the three-dimensional structure of a protein is captured by a combination of geometry-based and energy-based methods. Taking advantage of the rapid development of deep learning, all binding features are extracted into three-dimensional grids that are fed into a convolutional neural network. Two datasets were used in the experiments: the sc-PDB dataset for descriptor extraction and binding site prediction, and the PDBbind dataset only for testing and verifying the generalization of the method. Comparison with previous methods shows that the proposed descriptor is effective in predicting binding sites.
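To make the energy-based grid channel concrete, the following is a minimal sketch of how a van der Waals potential grid could be filled: a probe is placed at each voxel and the probe-protein energy is stored as the voxel value. The abstract does not give the functional form, so the 12-6 Lennard-Jones expression, the parameters `epsilon` and `r_min`, the clamping distance `min_dist`, and the function names are all assumptions made for illustration, not the authors' implementation.

```python
import math


def lj_potential(r, epsilon=0.15, r_min=4.0):
    """12-6 Lennard-Jones energy at distance r (in Å).

    Assumed form: the minimum value is -epsilon, reached at r == r_min.
    Both parameters are hypothetical placeholders.
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def vdw_grid(atoms, origin, shape, spacing=1.0, min_dist=0.5):
    """Return grid[i][j][k] holding the summed probe-atom energy per voxel.

    atoms   : list of (x, y, z) protein atom coordinates in Å
    origin  : (x, y, z) coordinate of voxel (0, 0, 0)
    shape   : (nx, ny, nz) number of voxels along each axis
    spacing : voxel edge length in Å
    min_dist: distances are clamped to this value to avoid division by zero
    """
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                # Place the probe at this voxel position.
                probe = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + k * spacing,
                )
                total = 0.0
                for atom in atoms:
                    r = max(math.dist(probe, atom), min_dist)
                    total += lj_potential(r)
                grid[i][j][k] = total
    return grid
```

With a single atom at the origin and a row of voxels along x, the most favorable (most negative) voxel in this sketch is the one at distance `r_min` from the atom, while voxels very close to the atom receive a large repulsive value.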

Conclusions: A new protein descriptor is proposed for the prediction of the drug binding sites of proteins. This method combines the three-dimensional structure of a protein with its non-bonded interactions with small molecules, so that the important factors influencing the formation of a binding site are taken into account. Analysis of the experiments indicates that the descriptor is robust for site prediction.

Keywords: Binding sites prediction; Deep learning; Molecule descriptor; Protein pockets.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Performance comparison of different channels
Fig. 2. Comparison of different MinPts values for DBSCAN
Fig. 3. 5-fold cross-validation experiment for Top3 prediction. (a) fold 1, (b) fold 2, (c) fold 3, (d) fold 4, (e) fold 5
Fig. 4. 5-fold cross-validation experiment for Top5 prediction. (a) fold 1, (b) fold 2, (c) fold 3, (d) fold 4, (e) fold 5
Fig. 5. Error sum of different methods for Top3 predictions
Fig. 6. Error sum of different methods for Top5 predictions
Fig. 7. Generalization to PDBbind of the model trained on the sc-PDB dataset
Fig. 8. A slightly modified version of LIGSITE. The voxels represent the solvent, the green dots are the protein atoms, and the white area is the protein contour. The red lines are the scanning lines in the x direction with a step of 1 Å. When a scanning line passes through a protein-solvent-protein (PSP) sequence, the solvent voxels lying between the two protein regions are assigned a PSP event (purple voxels). In the three-dimensional case, proteins are scanned in seven directions: x, y, z, and the four diagonal directions
Fig. 9. The calculation process of the van der Waals force channel grid. The probe is placed in each grid voxel in turn, and the van der Waals potential between the probe and the protein is calculated as the voxel value
Fig. 10. Determination of the positive samples. The black dot is the geometric center of the protein binding site. A cube with a side length of 20 Å centered on it (shown as the red block in the figure) is set as the positive sample area; this area may contain up to 4×4×4 = 64 sampling blocks, all of which are marked as positive samples
Fig. 11. Training flow chart (4 channels)
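The protein-solvent-protein (PSP) scan described in the caption of Fig. 8 can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the protein is assumed to be already voxelized on a 1 Å grid as a set of occupied voxel indices, and the names `DIRECTIONS`, `hits_protein`, and `psp_counts` are hypothetical. Each solvent voxel receives the number of scan directions (out of seven) along which it lies between two protein voxels.

```python
# Seven scan directions: the three axes and the four cubic diagonals.
DIRECTIONS = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (1, 1, 1), (1, 1, -1), (1, -1, 1), (-1, 1, 1),
]


def hits_protein(protein, shape, start, step):
    """Walk from start along step; return True if a protein voxel is
    reached before the walk leaves the grid."""
    x, y, z = start[0] + step[0], start[1] + step[1], start[2] + step[2]
    while 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
        if (x, y, z) in protein:
            return True
        x, y, z = x + step[0], y + step[1], z + step[2]
    return False


def psp_counts(protein, shape):
    """Map every solvent voxel to its number of PSP events (0 to 7).

    protein: set of (i, j, k) voxel indices occupied by protein
    shape  : (nx, ny, nz) grid dimensions in voxels
    """
    counts = {}
    for x in range(shape[0]):
        for y in range(shape[1]):
            for z in range(shape[2]):
                voxel = (x, y, z)
                if voxel in protein:
                    continue  # only solvent voxels can carry PSP events
                n = 0
                for d in DIRECTIONS:
                    back = (-d[0], -d[1], -d[2])
                    # A PSP event needs protein on both sides of the voxel.
                    if hits_protein(protein, shape, voxel, d) and \
                            hits_protein(protein, shape, voxel, back):
                        n += 1
                counts[voxel] = n
    return counts
```

A higher count indicates a more enclosed solvent voxel. How the counts are thresholded or turned into a grid channel is not stated in the caption, so that step is left out of the sketch.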
