J Comput Aided Mol Des. 2015 Jun;29(6):485-509.

doi: 10.1007/s10822-015-9846-3. Epub 2015 May 5.

Knowledge-guided docking: accurate prospective prediction of bound configurations of novel ligands using Surflex-Dock

Ann E Cleves¹, Ajay N Jain

PMID: 25940276
PMCID: PMC4464052
DOI: 10.1007/s10822-015-9846-3

Ann E Cleves et al. J Comput Aided Mol Des. 2015 Jun.

Affiliation

¹ Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA.

Abstract

Prediction of the bound configuration of small-molecule ligands that differ substantially from the cognate ligand of a protein co-crystal structure is much more challenging than re-docking the cognate ligand. Success rates for cross-docking in the range of 20-30% are common. We present an approach that uses structural information known prior to a particular cutoff date to make predictions on ligands whose bound structures were determined later. The knowledge-guided docking protocol was tested on a set of ten protein targets using a total of 949 ligands. The benchmark data set, called PINC ("PINC Is Not Cognate"), is publicly available. Protein pocket similarity was used to choose representative structures for ensemble docking. The docking protocol made use of known ligand poses prior to the cutoff date, both to help guide the configurational search and to adjust the rank of predicted poses. Overall, the top-scoring pose family was correct over 60% of the time, with the top-two pose families approaching a 75% success rate. Correct poses among all those predicted were identified nearly 90% of the time. The largest improvements came from the use of molecular similarity to improve ligand pose rankings and the strategy for identifying representative protein structures. With the exception of a single outlier target, the knowledge-guided docking protocol produced results matching the quality of cognate-ligand re-docking, but it did so on a very challenging temporally segregated cross-docking benchmark.
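The temporal segregation and the top-N pose-family success rates described above can be illustrated with a minimal Python sketch. This is not the authors' code or any part of Surflex-Dock: the `Complex` record, the function names, the placeholder identifiers and dates, and the 2.0 Å RMSD threshold for calling a pose family "correct" are all assumptions made for illustration. Only the 27 June 2003 cutoff is taken from the source (the CDK2 example in Fig. 2).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Complex:
    """Hypothetical record for one protein-ligand structure."""
    pdb_id: str
    release_date: date


def temporal_split(complexes, cutoff):
    """Split structures into those known before the cutoff and later test cases."""
    known = [c for c in complexes if c.release_date < cutoff]
    test = [c for c in complexes if c.release_date >= cutoff]
    return known, test


def top_k_success_rate(ranked_rmsds, k, threshold=2.0):
    """Fraction of ligands with a correct pose family among the top k.

    ranked_rmsds holds one list per ligand: the RMSD (in angstroms) of each
    predicted pose family to the experimental pose, ordered by rank.
    The 2.0 threshold is an assumed convention, not a value from the abstract.
    """
    if not ranked_rmsds:
        return 0.0
    hits = sum(
        1 for families in ranked_rmsds
        if any(rmsd <= threshold for rmsd in families[:k])
    )
    return hits / len(ranked_rmsds)


# Placeholder identifiers and dates, invented purely for illustration.
structures = [
    Complex("AAA1", date(2001, 3, 14)),
    Complex("AAA2", date(2003, 6, 26)),
    Complex("AAA3", date(2003, 6, 27)),
    Complex("AAA4", date(2010, 9, 1)),
]
known, test = temporal_split(structures, cutoff=date(2003, 6, 27))

# Invented RMSD values for four hypothetical test ligands.
example_rmsds = [[1.2, 5.0], [4.1, 1.8], [6.0, 7.5], [0.9]]
top1 = top_k_success_rate(example_rmsds, k=1)
top2 = top_k_success_rate(example_rmsds, k=2)
```

In this sketch, everything in `known` could be used to guide docking and re-rank poses, while success rates are computed only over ligands in `test`, so that no information from after the cutoff leaks into the predictions.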

Figures

**Fig. 1**
The structural information available about CDK2 inhibitors prior to July 2003: all 42 protein structures shown with five small ligands in the active site (*top*); all 42 ligands oriented with their hinge-binding moieties upward (*bottom left*); and all 42 ligands with the viewpoint from the kinase hinge (*bottom right*)

**Fig. 2**
Temporal partitioning for cross-docking prediction: the information above the dotted line became publicly available prior to 6-27-2003 and is to be used to make predictions on ligands whose bound configurations were determined later

**Fig. 3**
The four binding sites with the largest volumes

**Fig. 4**
The protein binding sites with volume sizes ranked 5–8

**Fig. 5**
Dynamic substructure matching of a ligand to be docked to ligands whose bound poses are known is used to provide additional search focus on known binding motifs. Note that the ligand conformations shown are from the respective crystal structures to illustrate the relative alignments of known fragments to those present in the subject ligand

**Fig. 6**
Molecular similarity to bound ligands of different scaffold types can help in pose-ranking

**Fig. 7**
The top-scoring pose families for the 2XNB ligand for knowledge-guided docking (a, marked “G-”) and unguided docking (b, marked “U-”). The crystallographic poses (two alternates) are shown in *thick tan sticks* with the docked pose families shown in contrasting colors

**Fig. 8**
Overall performance of Surflex-Dock under different docking protocols (*left*) and considering the effect of using a protein ensemble (*right*) compared with a single protein variant for each target (aggregated over five different selections each)

**Fig. 9**
Overall docking performance under different docking protocols for nine targets. The key curves are blue (top-scoring pose family using the knowledge-guided protocol), magenta (top two pose families), and red (top pose family in the unguided protocol)

**Fig. 10**
The effects of using single protein exemplars versus an ensemble of five for nine targets. The key comparisons are between the red curve (protein ensemble with no substructural guidance during docking), blue curve (adding substructural guidance), and the remaining curves (each from a single protein variant from the ensemble, using no substructural guidance)

**Fig. 11**
The experimental electron density (*red mesh*) for the 2XNB ligand is shown along with that computed for the entire top-scoring pose family ensemble (*thin cyan sticks* with *blue* transparent surface, a) and for the two alternate poses modeled in the crystallographic experiment (*thick tan sticks* with *gray* transparent surface, b)

**Fig. 12**
For thrombin, comparison between docking without knowledge-based guidance (*top right*, *pink*) and with guidance (*bottom*, *cyan*) for the ligand of 1ZGV, a triazolo-pyrimidine with a non-basic S1 binding element (*top left* in 2D and *thick tan sticks* in its experimental pose)

**Fig. 13**
Comparison between docking without knowledge-based guidance (*top*) and with guidance (*bottom*) for the thrombin ligand within 3RML, showing the top two pose families in the guided case

**Fig. 14**
The HIV-RT pocket volume (*top*) is shown along with docking results for the ligand of 3LAL: *tan sticks* for the experimental pose, *cyan* for the top-scoring pose family with knowledge-based guidance, and *pink* for the second-ranked family

**Fig. 15**
HIV-PR (*magenta*), shown in top view (*left*) and side view (*right*), with the top-scoring predicted pose family for the ligand of 1ZSR (*cyan* with experimental pose in *tan*)

**Fig. 16**
PTP1b (*pink*) is shown with the ligand of 1Q6S (*tan sticks*), the relevant known molecules sharing the difluoromethylphosphate (*green*), and the top two predicted pose families (*cyan* and *light magenta*)

**Fig. 17**
Canonical early bound ligands of PPARγ with a shared binding mode (*tan*), along with the nine worst test cases (*cyan*), all exhibiting a completely different binding mode for organic acids
