Flexible CDOCKER: Hybrid Searching Algorithm and Scoring Function with Side Chain Conformational Entropy

Yujin Wu¹, Charles L Brooks 3rd¹,²

Affiliations

¹ Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States.
² Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States.

PMID: 34704754
PMCID: PMC8684595
DOI: 10.1021/acs.jcim.1c01078

Yujin Wu et al. J Chem Inf Model. 2021 Nov 22;61(11):5535-5549. doi: 10.1021/acs.jcim.1c01078. Epub 2021 Oct 27.

Abstract

The binding of small-molecule ligands to protein or nucleic acid targets is important to numerous biological processes. Accurate prediction of the binding modes between a ligand and a macromolecule is of fundamental importance in structure-based structure-function exploration. When multiple ligands of different sizes are docked to a target receptor, it is reasonable to assume that the residues in the binding pocket may adopt alternative conformations upon interacting with the different ligands. In addition, it has been suggested that the entropic contribution to binding can be important. However, few flexible receptor docking methods attempt to include the side chain conformational entropy upon binding. Here, we propose a new physics-based scoring function that includes both enthalpic and entropic contributions upon binding by considering the conformational variability of the flexible side chains within the ensemble of docked poses. We also describe a novel hybrid searching algorithm that combines molecular dynamics (MD)-based simulated annealing with genetic algorithm crossovers to enhance sampling of the enlarged search space. We demonstrate improved accuracy in flexible cross-docking experiments compared with rigid cross-docking. We test our developments on five protein targets, thrombin, dihydrofolate reductase (DHFR), T4 L99A, T4 L99A/M102Q, and PDE10A, which belong to different enzyme classes with different binding pocket environments and serve as a representative set of diverse ligands and receptors. Each target has dozens of different ligands bound to the same binding pocket. We also demonstrate that this flexible docking algorithm may be applicable to RNA docking with a representative riboswitch example. Our findings show significant improvements in top-ranking accuracy across this set, with the largest improvement relative to rigid docking, 23.64%, occurring for ligands binding to DHFR.
We then evaluate the ability of the proposed flexible receptor docking algorithm to identify lead compounds within a large chemical space, using a subset of DUD-E containing the receptor targets MCR, GCR, and ANDR. We demonstrate that our new algorithms show improved performance in modeling flexible binding site residues compared to DOCK. We also select the T4 L99A and T4 L99A/M102Q decoy sets, which contain dozens of binders and experimentally validated nonbinders, to test our approach in distinguishing binders from nonbinders. We illustrate that our new algorithms for searching and scoring have superior performance to rigid receptor CDOCKER as well as AutoDock Vina. Finally, we suggest that flexible CDOCKER is sufficiently fast to be utilized in high-throughput docking screens in the context of hierarchical approaches.
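The abstract does not give the functional form of the entropic term. As a rough illustration only, the sketch below assumes a Gibbs entropy computed from the rotamer populations that each flexible side chain adopts within one cluster of docked poses, with residues treated as independent. The function names, the rotamer labels, and the `mean_energy - T*S` combination are all assumptions made for this example, not the published scoring function.

```python
import math
from collections import Counter

# Molar gas constant in kcal/(mol*K); plays the role of k_B in molar units.
K_B = 0.0019872041


def side_chain_entropy(rotamer_labels):
    """Gibbs entropy S = -k_B * sum(p_i * ln p_i) over the observed rotamer states.

    rotamer_labels: one (hypothetical) discrete rotamer label per docked pose.
    """
    counts = Counter(rotamer_labels)
    total = sum(counts.values())
    return -K_B * sum((c / total) * math.log(c / total) for c in counts.values())


def entropy_corrected_score(mean_energy, rotamers_per_residue, temperature=298.15):
    """Return <E> - T*S for one cluster of poses.

    Assumption: side chains are independent, so their entropies simply add.
    """
    total_entropy = sum(
        side_chain_entropy(labels) for labels in rotamers_per_residue.values()
    )
    return mean_energy - temperature * total_entropy


# Usage: a side chain seen in two rotamers across four poses of one cluster.
cluster = {"GLU726": ["g+", "g+", "t", "t"]}
print(entropy_corrected_score(-10.0, cluster))
```

Under this assumed form, a cluster in which a side chain populates several rotamers scores lower (more favorable) than one with the same mean energy and a single rotamer.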

Figures

**Figure 1:**
Receptor binding pocket complexity. **(A)** Distribution of number of receptor flexible side chains. **(B)** Distribution of number of receptor rotatable bonds.

**Figure 2:**
2OUN flexible self-docking results using the flexible receptor docking algorithm. These two docking poses (pink) belong to the same cluster and are native-like docking poses. The corresponding flexible side chain GLU726 adopts two different conformations. The backbone atoms of the two GLU726 conformations are shown in orange, while the side chain atoms are shown in yellow and blue, respectively. The side chain amide groups adopt two different orientations.

**Figure 3:**
Flexible docking searching algorithm.

**Figure 4:**
Average RMSD distribution of ligand docking poses in the initial generation. The RMSD values are binned with a 0.5 increment.
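The binning described in the Figure 4 caption can be sketched as follows. This is a minimal example assuming half-open bins `[k*0.5, (k+1)*0.5)` starting at zero; the function name and the bin edges are assumptions, since the caption only states the increment.

```python
import math
from collections import Counter


def bin_rmsds(rmsds, width=0.5):
    """Count RMSD values per bin; keys are the lower edges of the occupied bins."""
    counts = Counter(math.floor(r / width) for r in rmsds)
    return {i * width: counts[i] for i in sorted(counts)}


# Usage with made-up RMSD values.
print(bin_rmsds([0.1, 0.4, 0.5, 1.2]))
```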

**Figure 5:**
Average population of native-like poses vs generation. The population for each docking measurement is calculated by dividing the number of native-like poses by 500 (the number of trials in a generation). The average population of native-like poses across all 6 datasets is plotted with error bars given by the standard deviation.
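The population calculation in the Figure 5 caption can be sketched as below. The caption does not define "native-like", so the 2.0 RMSD cutoff is an assumption, as are the function names and the choice of the sample (rather than population) standard deviation for the error bars.

```python
import statistics


def native_like_population(rmsds, cutoff=2.0, n_trials=500):
    """Fraction of the trials in one generation whose pose RMSD is within the cutoff."""
    if len(rmsds) != n_trials:
        raise ValueError("expected one RMSD per docking trial")
    return sum(r <= cutoff for r in rmsds) / n_trials


def mean_and_std(populations):
    """Average population and sample standard deviation across datasets."""
    return statistics.mean(populations), statistics.stdev(populations)


# Usage with made-up values: 100 native-like poses out of 500 trials.
generation = [1.0] * 100 + [5.0] * 400
print(native_like_population(generation))
```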

**Figure 6:**
Cumulative docking accuracy for (A) the T4 L99A dataset, (B) the T4 L99A/M102Q dataset, and (C) the riboswitch dataset. Distribution of ligand properties: (D) rotatable bonds, (E) logP, and (F) molecular weight. A rank of N means the correct docking pose is within the top N solutions. AutoDock Vina trials use an exhaustiveness of 20. Rigid CDOCKER uses 500 docking trials for each cross-docking experiment.

**Figure 7:**
Cumulative docking accuracy for (A) the PDE10A dataset, (B) the DHFR dataset, and (C) the thrombin dataset. Distribution of ligand properties: (D) rotatable bonds, (E) logP, and (F) molecular weight. A rank of N means the correct docking pose is within the top N solutions. AutoDock Vina trials use an exhaustiveness of 20. Rigid CDOCKER uses 500 docking trials for each cross-docking experiment.
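The "rank of N" definition used in the Figure 6 and Figure 7 captions can be turned into a cumulative accuracy curve as sketched below. The input format (one best rank per ligand, `None` when no correct pose was found) and the function name are assumptions made for this example.

```python
def cumulative_accuracy(best_ranks, max_rank=10):
    """Fraction of ligands whose first correct pose lies within the top N, for N = 1..max_rank.

    best_ranks: per ligand, the 1-based rank of the first correct pose,
                or None if no correct pose was produced.
    """
    if not best_ranks:
        raise ValueError("need at least one ligand")
    n = len(best_ranks)
    return [
        sum(r is not None and r <= k for r in best_ranks) / n
        for k in range(1, max_rank + 1)
    ]


# Usage with made-up ranks for four ligands.
print(cumulative_accuracy([1, 3, None, 2], max_rank=3))
```

The first element of the returned list corresponds to the top-rank accuracy quoted in the abstract and in the Figure 8 caption.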

**Figure 8:**
Cumulative docking accuracy for (A) the DHFR dataset and (B) the thrombin dataset, with some ligands starting from their native pose internal conformations. Top-rank pose prediction accuracy using the proposed flexible receptor docking algorithm is 74.54% for DHFR and 74.18% for thrombin.

**Figure 9:**
Properties of compounds in T4 L99A decoy set and T4 L99A/M102Q decoy set.
