Review
J Chem Inf Model. 2022 May 23;62(10):2301-2315.
doi: 10.1021/acs.jcim.1c01510. Epub 2022 Apr 21.

Pose Classification Using Three-Dimensional Atomic Structure-Based Neural Networks Applied to Ion Channel-Ligand Docking

Heesung Shim et al. J Chem Inf Model.

Abstract

The identification of promising lead compounds showing pharmacological activities toward a biological target is essential in early-stage drug discovery. With the recent increase in available small-molecule databases, virtual high-throughput screening using physics-based molecular docking has emerged as an essential tool in assisting fast and cost-efficient lead discovery and optimization. However, the best-scored docking poses are often suboptimal, resulting in incorrect screening and chemical property calculation. We address the pose classification problem by leveraging data-driven machine learning approaches to identify correct docking poses from AutoDock Vina and Glide screens. To enable effective classification of docking poses, we present two convolutional neural network approaches: a three-dimensional convolutional neural network (3D-CNN) and an attention-based point cloud network (PCN) trained on the PDBbind refined set. We demonstrate the effectiveness of our proposed classifiers on multiple evaluation data sets, including the standard PDBbind CASF-2016 benchmark data set and various compound libraries with structurally different protein targets, including an ion channel data set extracted from the Protein Data Bank (PDB) and an in-house KCa3.1 inhibitor data set. Our experiments show that excluding false positive docking poses using the proposed classifiers improves virtual high-throughput screening to identify novel molecules against each target protein compared to the initial screen based on the docking scores.
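
The filtering idea in the abstract (discard poses the classifier rejects, then score each compound on the remaining poses) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 confidence threshold, and the example numbers are all hypothetical, and the classifier confidences are assumed to be already computed.

```python
from statistics import mean


def filtered_mean_score(poses, threshold=0.5):
    """Average docking score over poses the classifier accepts.

    poses: list of (docking_score, classifier_confidence) pairs.
    threshold: hypothetical confidence cutoff, not taken from the paper.
    Returns None when no pose passes, so the caller can skip the compound.
    """
    kept = [score for score, conf in poses if conf >= threshold]
    return mean(kept) if kept else None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up data: per compound, a measured affinity and a list of
    # (docking score, classifier confidence) pairs.
    compounds = {
        "cmpd_a": (7.2, [(-9.1, 0.92), (-8.7, 0.10), (-7.9, 0.81)]),
        "cmpd_b": (6.1, [(-8.8, 0.05), (-6.9, 0.77)]),
        "cmpd_c": (5.0, [(-6.0, 0.64), (-7.5, 0.20)]),
    }
    affinities, scores = [], []
    for affinity, poses in compounds.values():
        score = filtered_mean_score(poses)
        if score is not None:  # skip compounds with no accepted pose
            affinities.append(affinity)
            scores.append(score)
    print(pearson(scores, affinities))
```

Returning `None` for a compound with no accepted pose keeps the choice of fallback (for example, the top-ranked docking score) with the caller rather than fixing one in the helper.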

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Example of 20 Vina docking scores with their RMSD values. The RMSD values were calculated between each docking pose and the crystal structure of the 3R17 ligand (human carbonic anhydrase, hCA) from the PDBbind 2019 refined set.
Figure 2
(Left) Closed state of the KCa3.1 channel (PDB ID: 6CNM) and binding site of inhibitors (red box). (Right) General structures of triarylmethane- and cyclohexadiene-based KCa3.1 channel inhibitors. The four channel alpha subunits are rainbow colored; the channel associated with calmodulin is shown in yellow.
Figure 3
Overall network architecture of the proposed 3D-CNN and PCN. The input for the networks is 3D atomic structures with their features (3D pose representation). The PCN uses the input data directly, whereas the 3D-CNN uses their voxelized data. The optional interaction features are concatenated with one of the fully connected layer activations.
Figure 4
ROC curves of the pose classification models (3DCNN, 3DCNN_i, 3DCNN_a, 3DCNN_ia, PCN, PCN_a, PCN_ia, and RF_i) on CASF-2016.
Figure 5
ROC curves of the pose classification models (3DCNN, 3DCNN_i, 3DCNN_a, 3DCNN_ia, PCN, PCN_a, PCN_ia, and RF_i) on the PDB ion channel data set.
Figure 6
Pearson correlations between docking scores and binding affinities using seven pose classification models for the PDB ion channel data set. (a) Vina scores of top-ranked poses, (b) average Vina scores of all docking poses, and (c) average Vina scores across correct poses filtered by the proposed pose classifier (3D-CNN_ia).
Figure 7
Pearson correlations between the docking scores and binding affinities using seven pose classification models for the KCa3.1 channel inhibitor data set (left, Vina; right, Glide). (a) Vina scores of top-ranked poses, (b) average Vina scores of all docking poses, (c) average Vina scores across correct poses filtered by the proposed pose classifier (3D-CNN_i), (d) Glide scores of top-ranked poses, (e) average Glide scores of all docking poses, and (f) average Glide scores across correct poses filtered by the proposed pose classifier (3D-CNN_a).
Figure 8
Example of correct and incorrect docking poses in the PDB ion channel data set with pose classification results (4TNW, top; 4XDK, bottom). Each docking pose is annotated with its RMSD, Vina score, and the model confidence of one of our pose classifiers (3D-CNN_ia). The model confidence lies in [0, 1], where a value close to 0 indicates an incorrect pose.
Figure 9
Pearson correlations between binding affinity and docking scores of the top 10, 20, 30, and 40 ranked compounds based on the confidence scores of our pose classifier models on the KCa3.1 channel inhibitor data set. 3D-CNN_i with Vina docking poses (left) and 3D-CNN_a with Glide docking poses (right).
Figure 10
pIC50 of the top 10 ranked compounds in the KCa3.1 channel inhibitor data set without (left) and with (right) the pose classifier (3D-CNN_a). Orange indicates strong binders (pIC50 ≥ 7); yellow indicates compounds with 6 ≤ pIC50 < 7.
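
The captions of Figures 1 and 8 characterize each docking pose by its RMSD to the crystal ligand and by whether it counts as correct. The sketch below shows one plain way to compute such an RMSD and derive a correct/incorrect label. It assumes both coordinate lists contain the same atoms in the same order, applies no symmetry correction, and uses a 2.0 Å cutoff, which is a common docking convention and not a value stated on this page. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched sets of 3D coordinates.

    Each argument is a list of (x, y, z) tuples in angstroms, with atoms in
    identical order. No alignment is done: docked poses and the crystal
    ligand are assumed to share the receptor's coordinate frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def is_correct_pose(pose, crystal, cutoff=2.0):
    """Label a pose as correct when its RMSD is within the cutoff.

    The 2.0 angstrom default is an assumed convention, not the paper's setting.
    """
    return rmsd(pose, crystal) <= cutoff


if __name__ == "__main__":
    # Made-up two-atom ligand, for illustration only.
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    near_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]  # shifted 1 angstrom along z
    far_pose = [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0)]   # shifted 3 angstroms along y
    for name, pose in (("near", near_pose), ("far", far_pose)):
        print(name, rmsd(pose, crystal), is_correct_pose(pose, crystal))
```

Labels produced this way could serve as binary targets for a pose classifier, with the classifier's confidence in [0, 1] playing the role described in the Figure 8 caption.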
