J Med Chem. 2012 Jul 26;55(14):6582-94.
doi: 10.1021/jm300687e. Epub 2012 Jul 5.

Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking

Free PMC article

Michael M Mysinger et al. J Med Chem. 2012.

Abstract

A key metric to assess molecular docking remains ligand enrichment against challenging decoys. Whereas the directory of useful decoys (DUD) has been widely used, clear areas for optimization have emerged. Here we describe an improved benchmarking set that includes more diverse targets such as GPCRs and ion channels, totaling 102 proteins with 22886 clustered ligands drawn from ChEMBL, each with 50 property-matched decoys drawn from ZINC. To ensure chemotype diversity, we cluster each target's ligands by their Bemis-Murcko atomic frameworks. We add net charge to the matched physicochemical properties and include only the most dissimilar decoys, by topology, from the ligands. An online automated tool (http://decoys.docking.org) generates these improved matched decoys for user-supplied ligands. We test this data set by docking all 102 targets, using the results to improve the balance between ligand desolvation and electrostatics in DOCK 3.6. The complete DUD-E benchmarking set is freely available at http://dude.docking.org.
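The decoy procedure described in the abstract (match physicochemical properties including net charge, then keep only the topologically most dissimilar candidates) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the dictionary layout, the property names `mw` and `logp`, the tolerance values, the set-based fingerprints, and the `keep_fraction` default are all assumptions made for the example. Only the net-charge match, the dissimilarity filter, and the figure of 50 decoys per ligand come from the text.

```python
import math


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of bit ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def property_matched(ligand, candidate, tolerances):
    """True if net charge is identical and every other property is within tolerance."""
    if candidate["charge"] != ligand["charge"]:
        return False
    return all(abs(candidate[k] - ligand[k]) <= tol for k, tol in tolerances.items())


def select_decoys(ligands, candidates, tolerances, keep_fraction=0.25, per_ligand=50):
    """Return {ligand id: [decoy ids]} (hypothetical helper, illustrative defaults)."""
    # Step 1: keep candidates that are property-matched to at least one ligand.
    matched = [c for c in candidates
               if any(property_matched(l, c, tolerances) for l in ligands)]
    # Step 2: rank by highest similarity to ANY ligand, most dissimilar first,
    # and retain only the most dissimilar fraction.
    matched.sort(key=lambda c: max(tanimoto(c["fp"], l["fp"]) for l in ligands))
    pool = matched[:math.ceil(len(matched) * keep_fraction)]
    # Step 3: hand out up to per_ligand unused decoys to each ligand.
    used, result = set(), {}
    for lig in ligands:
        picks = [c["id"] for c in pool
                 if c["id"] not in used and property_matched(lig, c, tolerances)]
        picks = picks[:per_ligand]
        used.update(picks)
        result[lig["id"]] = picks
    return result
```

Ranking by the maximum similarity to any ligand, rather than to the matched ligand alone, is a design choice in this sketch: it guards against a decoy that resembles a different active of the same target.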

Figures

Figure 1
DUD-E target classification. Number of the 102 targets that belong to eight broad protein categories.
Figure 2
Ligand clustering. (A) The seventh largest Murcko cluster of kinesin-like protein 1 (KIF11), showing both the scaffold (left) and all seven member ligands. (B) Number of ligands in each of the 70 KIF11 Bemis–Murcko atomic frameworks. We removed lower affinity compounds from over-represented clusters (above the line), while retaining 100 ligands. (C) Number of adenosine A2A receptor (AA2AR) Murcko clusters plotted against affinity threshold. Fewer than 600 clusters are present using a 30 nM affinity threshold.
Figure 3
Decoy generation. (A) Three key “warhead” groups from factor Xa (FA10), glycinamide ribonucleotide transformylase (PUR2), and thymidine kinase (KITH). (B) Fraction of warheads remaining is plotted against the dissimilarity method. The dissimilarity methods consist of a fingerprint (Daylight or ECFP4) and either a hard cutoff or a fraction of the most dissimilar decoys to be retained. (C) Property distributions of estrogen receptor α (ESR1) for both the 383 ligands (blue) and the 20685 property-matched decoys (red).
Figure 4
Retrospective enrichment comparing ligand desolvation and electrostatics methods. Docking results over DUD-E as measured by LogAUC. “None” has no ligand desolvation term, “SEV” uses solvent-excluded volume ligand desolvation, and “Thin” employs a thin low-dielectric layer in the electrostatic calculations.
Figure 5
Representative ROC plots. ROC plots using no desolvation (None), solvent-excluded volume ligand desolvation (SEV), the thin low-dielectric layer (Thin), or a drug-like background that consists of all ChEMBL12 ligands with affinities better than 10 μM (Drug-like). The black dotted line represents the results expected from docking ligands randomly. LogAUC percentages are reported in the legend text.
Figure 6
Representative docking poses. The crystallographic ligand was rebuilt and docked from scratch. (A–F) The crystal pose (magenta) is compared to the resulting docked pose (green). In (C), more ligand conformations are generated and the redocked pose is also shown (tan). Key hydrogen bonds are shown by black dotted lines, and the partially transparent protein surface is colored by atom type.
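Figures 4 and 5 report enrichment as LogAUC, which this page does not define. A commonly used definition, assumed here, is the area under the ROC curve when the false-positive rate is plotted on a log10 axis from 0.1% to 100%, normalized so that a perfect ranking scores 1. The sketch below computes that quantity for a ranked list of ligand and decoy labels; the function names and the lower limit `lam=0.001` are choices made for this example, not taken from the paper.

```python
import math


def roc_points(ranked_labels):
    """ROC points (FPR, TPR) from labels ordered best score first (True = ligand).

    Assumes the list contains at least one ligand and at least one decoy.
    """
    n_lig = sum(ranked_labels)
    n_dec = len(ranked_labels) - n_lig
    tp = fp = 0
    points = [(0.0, 0.0)]
    for is_ligand in ranked_labels:
        if is_ligand:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_dec, tp / n_lig))
    return points


def log_auc(points, lam=0.001):
    """Area under a piecewise-linear ROC curve against log(FPR) on [lam, 1].

    Each linear segment y = a + b*x is integrated exactly:
    integral of y d(ln x) = a*ln(x1/x0) + b*(x1 - x0).
    Dividing by ln(1/lam) normalizes a perfect curve to 1.0.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 <= lam or x1 == x0:
            continue  # segment lies left of the window, or is vertical
        slope = (y1 - y0) / (x1 - x0)
        if x0 < lam:
            # Clip the segment so it starts at the lower FPR limit.
            y0 += slope * (lam - x0)
            x0 = lam
        intercept = y0 - slope * x0
        area += intercept * math.log(x1 / x0) + slope * (x1 - x0)
    return area / math.log(1.0 / lam)
```

Under this definition a random ranking (the diagonal, shown as the black dotted line in Figure 5) scores about 14.5%, so values are often reported relative to that baseline.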
