Review

Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening

Qurrat Ul Ain et al. Wiley Interdiscip Rev Comput Mol Sci. 2015 Nov-Dec;5(6):405-424. doi: 10.1002/wcms.1225. Epub 2015 Aug 28.

Abstract

Docking tools that predict whether and how a small molecule binds to a target can be applied if a structural model of that target is available. The reliability of docking depends, however, on the accuracy of the adopted scoring function (SF). Despite intense research over the years, improving the accuracy of SFs for structure-based binding affinity prediction or virtual screening has proven to be a challenging task for any class of method. New SFs based on modern machine-learning regression models, which do not impose a predetermined functional form and are thus able to exploit much larger amounts of experimental data effectively, have recently been introduced. These machine-learning SFs have been shown to outperform a wide range of classical SFs at both binding affinity prediction and virtual screening. The emerging picture from these studies is that the classical approach of using linear regression with a small number of expert-selected structural features can be strongly improved by a machine-learning approach based on nonlinear regression allied with comprehensive data-driven feature selection. Furthermore, the performance of classical SFs does not grow with larger training datasets, and hence this performance gap is expected to widen as more training data become available in the future. Other topics covered in this review include predicting the reliability of an SF on a particular target class, generating synthetic data to improve predictive performance, and modeling guidelines for SF development. WIREs Comput Mol Sci 2015, 5:405-424. doi: 10.1002/wcms.1225. For further resources related to this article, please visit the WIREs website.
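The contrast the abstract draws between a predetermined (additive, linear) functional form and a nonparametric model that infers the form from the data can be sketched with a toy example. The descriptors and "affinities" below are synthetic, and k-nearest-neighbour regression merely stands in for the more powerful learners (e.g. random forest) used in actual machine-learning SFs:

```python
# Illustrative sketch (not from the paper): a classical SF's fixed linear
# functional form versus a nonparametric learner on synthetic data.

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b (fixed, parametric form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return lambda x: a * x + b

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regression: no assumed functional form."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def mse(model, xs, ys):
    """Mean squared error of a model over a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic, strongly nonlinear "affinity" surface: y = x^2.
train_x = [i / 10 for i in range(-20, 21)]
train_y = [x * x for x in train_x]

linear = fit_linear(train_x, train_y)
knn = fit_knn(train_x, train_y)

# The nonparametric model captures curvature the linear form cannot.
print(mse(linear, train_x, train_y) > mse(knn, train_x, train_y))  # → True
```

On this deliberately nonlinear surface the least-squares line cannot improve no matter how much data it sees, while the nonparametric model tracks the curvature, mirroring the performance-gap argument made in the abstract.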


Figures

Figure 1
Examples of force‐field, knowledge‐based, empirical, and machine‐learning scoring functions (SFs). The first three types, collectively termed classical SFs, are distinguished by the type of structural descriptors employed. However, from a mathematical perspective, all classical SFs assume an additive functional form. By contrast, nonparametric machine‐learning SFs make no assumptions about the functional form. Instead, the functional form is inferred from the training data in an unbiased manner. As a result, classical and machine‐learning SFs behave very differently in practice [20].
Figure 2
Criteria to select data to build and validate scoring functions (SFs). Protein‐ligand complexes can be selected by their quality and protein‐family membership, as well as by the type of structural and binding data, depending on the intended docking application and modeling strategy. Classical SFs typically employ a few hundred X‐ray crystal structures of the highest quality, along with their binding constants, to score complexes with proteins from any family. In contrast, data selection for machine‐learning SFs is much more varied, with the largest training data volumes leading to the best performance.
Figure 3
Workflow to train and validate a scoring function (SF). Feature selection (FS) can be data‐driven or expert‐based (for simplicity, we do not represent embedded FS, which would take place at the model training stage). A range of machine‐learning regression or classification models can be used for training, whereas classical SFs have used linear regression. Model selection has ranged from taking the best model on the training set to selecting the one with the best cross‐validated performance. The metrics for model selection and performance evaluation depend on the application.
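The cross-validated model-selection step described in this caption can be illustrated with a minimal, stdlib-only sketch. The two candidate "SFs" and the perfectly linear synthetic data are invented for illustration:

```python
# Hypothetical sketch of model selection by cross-validated performance:
# split the data into k folds, score each candidate model out-of-fold,
# and keep the candidate with the lowest average error.

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold))
        train = [j for j in range(n) if j not in test]
        yield train, test

def cross_val_mse(fit, xs, ys, k=5):
    """Mean squared error averaged over k held-out folds."""
    errs = []
    for tr, te in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in tr], [ys[i] for i in tr])
        errs.append(sum((model(xs[i]) - ys[i]) ** 2 for i in te) / len(te))
    return sum(errs) / k

# Two made-up candidate "SFs": a constant baseline and a slope-through-origin fit.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_slope(xs, ys):
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda x: a * x

xs = list(range(1, 21))
ys = [2.0 * x for x in xs]  # perfectly linear synthetic data

scores = {name: cross_val_mse(f, xs, ys)
          for name, f in [("mean", fit_mean), ("slope", fit_slope)]}
best = min(scores, key=scores.get)
print(best)  # → slope
```

Selecting by cross-validated error, rather than by fit on the training set, is what guards against an over-flexible model winning merely by memorizing its training data.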
Figure 4
Blind test showing how test set performance (Rp) grows with more training data when using random forest (models 3 and 4), but stagnates with multiple linear regression (model 2). Model 1 is AutoDock Vina, which acts as a performance baseline.
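Rp in this figure denotes the Pearson correlation coefficient between predicted and measured binding affinities on the test set. A minimal stdlib implementation, with made-up numbers, would be:

```python
from math import sqrt

def pearson_r(pred, meas):
    """Pearson correlation (Rp) between predicted and measured affinities."""
    n = len(pred)
    mp, mm = sum(pred) / n, sum(meas) / n
    cov = sum((p - mp) * (m - mm) for p, m in zip(pred, meas))
    sp = sqrt(sum((p - mp) ** 2 for p in pred))
    sm = sqrt(sum((m - mm) ** 2 for m in meas))
    return cov / (sp * sm)

# Perfectly correlated toy predictions give Rp = 1 (up to rounding).
print(round(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 6))  # → 1.0
```

Because Rp is invariant to linear rescaling of the predictions, it measures ranking-plus-linearity of the predicted affinities rather than their absolute values, which is why it is a common benchmark metric for scoring functions.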

References

    1. Schneider G. Virtual screening: an endless staircase? Nat Rev Drug Discov 2010, 9:273–276. - PubMed
    2. Vasudevan SR, Churchill GC. Mining free compound databases to identify candidates selected by virtual screening. Expert Opin Drug Discov 2009, 4:901–906. - PubMed
    3. Villoutreix BO, Renault N, Lagorce D, Sperandio O, Montes M, Miteva MA. Free resources to assist structure‐based virtual ligand screening experiments. Curr Protein Pept Sci 2007, 8:381–411. - PubMed
    4. Xing L, McDonald JJ, Kolodziej SA, Kurumbail RG, Williams JM, Warren CJ, O'Neal JM, Skepner JE, Roberds SL. Discovery of potent inhibitors of soluble epoxide hydrolase by combinatorial library design and structure‐based virtual screening. J Med Chem 2011, 54:1211–1222. - PubMed
    5. Hermann JC, Marti‐Arbona R, Fedorov AA, Fedorov E, Almo SC, Shoichet BK, Raushel FM. Structure‐based activity prediction for an enzyme of unknown function. Nature 2007, 448:775–779. - PMC - PubMed