Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment
- PMID: 29190087
- DOI: 10.1021/acs.jcim.7b00309
Abstract
Molecular docking, scoring, and virtual screening play an increasingly important role in computer-aided drug discovery. Scoring functions (SFs) are typically employed to predict the binding conformation (docking task), binding affinity (scoring task), and binary activity level (screening task) of ligands against a critical protein target in a disease's pathway. In most molecular docking software packages available today, a generic binding affinity-based (BA-based) SF is invoked for all three tasks to solve three different, but related, prediction problems. The limited predictive accuracies of such SFs in these three tasks have been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we develop BT-Score, an ensemble machine-learning (ML) SF of boosted decision trees and thousands of predictive descriptors to estimate BA. BT-Score reproduced the BA of out-of-sample test complexes with a correlation of 0.825. Even with this high accuracy in the scoring task, we demonstrate that the docking and screening performance of BT-Score and other BA-based SFs is far from ideal. This has motivated us to build two task-specific ML SFs for the docking and screening problems. We propose BT-Dock, a boosted-tree ensemble model trained on a large number of native and computer-generated ligand conformations and optimized to predict binding poses explicitly. This model has shown an average improvement of 25% over its BA-based counterparts in different ligand pose prediction scenarios. A similar improvement has also been obtained by our screening-based SF, BT-Screen, which directly models the ligand activity labeling task as a classification problem. BT-Screen is trained on thousands of active and inactive protein-ligand complexes to optimize it for finding real actives from databases of ligands not seen in its training set.
In addition to the three task-specific SFs, we propose a novel multi-task deep neural network (MT-Net) that is trained on data from the three tasks to simultaneously predict binding poses, affinities, and activity levels. We show that the performance of MT-Net is superior to conventional SFs and on a par with or better than models based on single-task neural networks.
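The abstract separates three prediction problems (scoring, docking, screening), and each is usually judged by a different measure. The sketch below is illustrative only and is not the authors' evaluation code. It assumes three commonly used measures: Pearson correlation between predicted and measured affinities for the scoring task, the fraction of complexes whose top-scored pose lies within an RMSD cutoff of the native pose for the docking task, and an enrichment factor over the top-ranked fraction of a ligand database for the screening task. The function names, the 2.0 Å cutoff, and all sample values are hypothetical.

```python
from math import sqrt


def pearson(predicted, measured):
    """Scoring task: Pearson correlation between predicted and measured affinity."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / sqrt(var_p * var_m)


def docking_success(pose_sets, rmsd_cutoff=2.0):
    """Docking task: fraction of complexes whose best-scored pose is near-native.

    pose_sets holds one list per complex; each entry is (score, rmsd_to_native),
    with higher scores meaning "more native-like". The cutoff is an assumption.
    """
    hits = 0
    for poses in pose_sets:
        _, top_rmsd = max(poses, key=lambda pose: pose[0])
        if top_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(pose_sets)


def enrichment_factor(scores, is_active, top_fraction=0.1):
    """Screening task: active rate in the top-ranked fraction vs. the whole set."""
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * top_fraction)))
    actives_top = sum(1 for _, active in ranked[:n_top] if active)
    actives_all = sum(1 for active in is_active if active)
    return (actives_top / n_top) / (actives_all / len(ranked))


# Hypothetical toy data, invented purely to exercise the three functions.
toy_predicted = [6.1, 7.4, 5.0, 8.2]
toy_measured = [6.0, 7.0, 5.5, 8.5]
toy_pose_sets = [
    [(0.9, 1.2), (0.4, 5.0)],  # top-scored pose is near-native
    [(0.3, 0.8), (0.7, 6.1)],  # top-scored pose is far from native
]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [True, False, True, False, False, False, False, False, False, False]

scoring_r = pearson(toy_predicted, toy_measured)
docking_rate = docking_success(toy_pose_sets)
screening_ef = enrichment_factor(toy_scores, toy_labels, top_fraction=0.2)
```

The second toy complex shows why a single scoring function can do well on one task and poorly on another: its near-native pose exists in the candidate set but is not ranked first, so it counts as a docking failure regardless of how well affinities correlate.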