Brief Bioinform. 2022 May 13;23(3):bbac051.
doi: 10.1093/bib/bbac051.

Improving protein-ligand docking and screening accuracies by incorporating a scoring function correction term

Liangzhen Zheng et al. Brief Bioinform. 2022.

Abstract

Scoring functions are important components of molecular docking for structure-based drug discovery. Traditional scoring functions, generally empirical or force field-based, are robust and have proven useful for hit identification and lead optimization. Although multiple highly accurate deep learning- or machine learning-based scoring functions have been developed, their direct application to docking and screening remains limited. We describe a novel strategy for developing a reliable protein-ligand scoring function by augmenting the traditional Vina score with a correction term (OnionNet-SFCT). The correction term is based on an AdaBoost random forest model that uses multiple layers of contacts formed between protein residues and ligand atoms. When added to the Vina score, the correction term considerably improves the performance of AutoDock Vina in docking and screening tasks on different benchmarks (such as the cross-docking dataset, CASF-2016, DUD-E and DUD-AD). Our model can also be combined with multiple docking applications to increase pose selection accuracy and screening ability, indicating its broad applicability to structure-based drug discovery. In addition, in a reverse screening exercise, the combined scoring strategy successfully identified multiple known receptors of a plant hormone. In summary, the results show that combining a data-driven model (OnionNet-SFCT) with an empirical scoring function (Vina score) is a good scoring strategy that could be useful for structure-based drug discovery and, potentially, for target fishing in the future.

Keywords: machine learning; molecular docking; reversal virtual screening; scoring function; virtual screening.
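
The two ideas in the abstract, distance-layered protein-ligand contacts and a correction term added to the Vina score, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the atom-pair counting (the abstract describes residue-atom contacts), the shell cutoffs and the `weight` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def shell_contact_counts(protein_xyz, ligand_xyz, cutoffs):
    """Count protein-ligand atom pairs in consecutive distance shells.

    Assumption: contacts are counted per atom pair; the published model
    describes contacts between protein residues and ligand atoms.
    """
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    # Pairwise distance matrix with shape (n_protein, n_ligand).
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Shell k holds the pairs with edges[k] <= distance < edges[k + 1].
    edges = np.concatenate(([0.0], np.asarray(cutoffs, dtype=float)))
    return np.array([
        np.count_nonzero((dist >= lo) & (dist < hi))
        for lo, hi in zip(edges[:-1], edges[1:])
    ])

def combined_score(vina_score, correction, weight=1.0):
    """Add a weighted correction term to a Vina score.

    Assumption: a plain weighted sum; the exact combination rule is not
    given in the abstract.
    """
    return vina_score + weight * correction

# Toy coordinates in angstroms (made-up values, not real structures).
protein = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
ligand = [[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
counts = shell_contact_counts(protein, ligand, cutoffs=[2.0, 4.0])
score = combined_score(-7.5, 0.5, weight=2.0)
```

In a setup like this, the shell counts would serve as input features to a regression model that predicts the correction term for each docking pose, and poses would then be re-ranked by the combined score.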

Figures

Figure 1
The performance of different scoring models (OnionNet-SFCT only, Vina score only and OnionNet-SFCT+Vina) on redocking and cross-docking tasks. (A–B) The success rate of different scoring methods when the ligand binding pocket was predefined (A) or unknown, with the receptor geometric center used as the docking box center (B). (C) The RMSD values of the top-ranking poses for each protein-ligand complex with (y-axis) or without (x-axis) the OnionNet-SFCT term.
Figure 2
The performance of OnionNet-SFCT along with different scoring functions on redocking and cross-docking tasks. (A) The change of the average top-ranking pose RMSD values after rescoring with OnionNet-SFCT combined scoring functions. (B) The change of the success rate of the top-ranking poses after rescoring with OnionNet-SFCT combined scoring functions.
Figure 3
The docking power (A) and RMSD-energy correlations (B) of the OnionNet-SFCT+Vina scoring strategy.
Figure 4
The screening power of the OnionNet-SFCT+Vina scoring strategy. (A) The enrichment factor (1%) of different scoring functions. (B) The success rate for identifying the best ligand in the top-ranking poses for different scoring functions. The 95% confidence intervals are indicated by black lines.
Figure 5
The per-target enrichment factor comparison with or without rescoring by OnionNet-SFCT+Vina or OnionNet-SFCT+Gnina on the two benchmarks DUD-E (A and B) and DUD-AD (C and D). Target names are labelled for targets whose enrichment factors increase substantially (up to 30) with rescoring.
Figure 6
Comparison of the binding modes in the crystal structure (gray) and as predicted by OnionNet-SFCT (green). The native protein structure is gray and the binding site residue Phe65 is indicated by red sticks. The predicted protein structure is colored by the atom-level pocket probability scores (ranging from 0.0 to 1.0), where blue indicates a low probability score and red indicates a high probability score.
Figure 7
The feature importance of the OnionNet-SFCT model. The x-axis represents the contact distance cutoff, whereas the y-axis represents the element types of the ligand molecules. Each panel displays the importance of the interactions between different ligand atoms and a specific residue type. The color scale indicates the feature importance, with yellow indicating the highest importance and blue the lowest.
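
The two metrics named in the captions above, the enrichment factor at 1% and the success rate of top-ranking poses, could be computed along the following lines. This is a sketch under stated assumptions, not the evaluation code used in the paper: lower scores are taken to be better (the Vina convention), the top fraction is rounded to at least one compound, and the 2.0 Å RMSD threshold is a commonly used convention assumed here rather than stated in the source. All function names are hypothetical.

```python
import numpy as np

def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment of actives in the best-scoring fraction of a library.

    Assumptions: lower score means better rank, and the top fraction is
    rounded to a whole number of compounds (at least one).
    """
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_total = scores.size
    n_actives = int(is_active.sum())
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, int(round(fraction * n_total)))
    # Stable sort keeps the input order for tied scores.
    top = np.argsort(scores, kind="stable")[:n_top]
    hits = int(is_active[top].sum())
    # (hits / n_top) divided by (n_actives / n_total).
    return (hits * n_total) / (n_top * n_actives)

def top_pose_success_rate(top_pose_rmsds, threshold=2.0):
    """Fraction of complexes whose top-ranked pose is within `threshold` angstroms."""
    rmsds = np.asarray(top_pose_rmsds, dtype=float)
    return float(np.mean(rmsds <= threshold))

# Toy library of 100 compounds (made-up data): actives at ranks 1 and 51.
scores = np.arange(100, dtype=float)
labels = np.zeros(100, dtype=bool)
labels[[0, 50]] = True
ef1 = enrichment_factor(scores, labels, fraction=0.01)
rate = top_pose_success_rate([0.5, 1.9, 2.5, 4.0])
```

In the toy library, the single compound in the top 1% is active while only 2% of the whole library is active, which is why the enrichment factor is much greater than 1.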
