J Comput Aided Mol Des. 2009 Jun;23(6):355-74.
doi: 10.1007/s10822-009-9266-3. Epub 2009 Apr 2.

Effects of protein conformation in docking: improved pose prediction through protein pocket adaptation

Ajay N Jain. J Comput Aided Mol Des. 2009 Jun.

Abstract

Computational methods for docking ligands have been shown to be remarkably dependent on precise protein conformation, where acceptable results in pose prediction have been generally possible only in the artificial case of re-docking a ligand into a protein binding site whose conformation was determined in the presence of the same ligand (the "cognate" docking problem). In such cases, on well curated protein/ligand complexes, accurate dockings can be returned as top-scoring over 75% of the time using tools such as Surflex-Dock. A critical application of docking in modeling for lead optimization requires accurate pose prediction for novel ligands, ranging from simple synthetic analogs to very different molecular scaffolds. Typical results for widely used programs in the "cross-docking case" (making use of a single fixed protein conformation) have rates closer to 20% success. By making use of protein conformations from multiple complexes, Surflex-Dock yields an average success rate of 61% across eight pharmaceutically relevant targets. Following docking, protein pocket adaptation and rescoring identifies single pose families that are correct an average of 67% of the time. Consideration of the best of two pose families (from alternate scoring regimes) yields a 75% mean success rate.

Figures

Figure 1
Two PDE4b structures (1RO9 and 1ROR) are shown superimposed. The former, shown in red, is in complex with 8-bromo-AMP (yellow carbons), and the latter, shown in blue, is in complex with AMP (green carbons). The single atom change from hydrogen to bromine results in a complete flip of the adenosine, where a common hydrogen bond is made by different atoms on the heterocycle. The protein conformational shift is subtle, with the largest change being in the position of MET-431 (indicated with an arrow). However, when docking into the cognate structure, the correct pose of 8-bromo-AMP is ranked much higher than when docking into the structure determined with AMP.
Figure 2
Cognate ligands for PDE4b, CDK2, and ESR1, with example test ligands shown below the line. Light shaded circles highlight corresponding moieties on the ligands within each target (H-bond acceptor interacting with the sidechain amide of GLN-443 within PDE4b, H-bond acceptor interacting with amide proton of LEU-83 within CDK2, hydroxyl interacting with the carboxylate of GLU-353 within ESR1). Note that a single atom change (hydrogen to bromine) causes a 180-degree moiety flip among the top two ligands of PDE4b (the pyrimidine flips relative to the remainder of the ligand in order to roughly superimpose the highlighted nitrogens).
Figure 3
Cognate ligands for F2 (thrombin), MAPK14, and MMP8, with example test ligands shown below the line. The highlighted moieties indicate corresponding functionality for each target (P1 element for F2, H-bond acceptor for the amide proton of MET-109 in MAPK14, and metal chelation element for MMP8).
Figure 4
Automatically chosen fragments of the cognate ligands for PDE4b. Each makes a critical interaction with GLN-443. The fragment on the lower right is able to make a hydrogen bond with its cognate protein conformation (1XOT, shown in green), whereas the other three are able to make interactions with the same protein conformation (1RO9, shown in blue).
Figure 5
Performance of Surflex-Dock using default geometric docking parameters on the Astex Diverse set of 85 cognate protein/ligand complexes (left plot) and on the cross docking set of 211 novel ligands docked to eight different protein targets (right plot). Overall performance in cognate-docking for top scoring poses was 76% success at a 2Å rmsd threshold (cumulative histogram shown with solid line). The best pose of the top 20 was within 2Å 94% of the time (dashed line). This level of performance is statistically indistinguishable from that of GOLD from the original paper. For cross-docking, the comparable performance levels were 25% and 60% for top-scoring and best pose, respectively. Among the cases where a good pose existed among the top scoring, the success rate for cognate-docking was 81%, but for cross-docking was 42%, highlighting the difficulty in ranking among poses under the latter real-world conditions.
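The success rates quoted in the Figure 5 caption can be illustrated with a minimal Python sketch. It is not Surflex-Dock's evaluation code: the function names, the input layout (one score-ranked list of RMSD values per ligand), and the plain heavy-atom RMSD without symmetry correction are all assumptions made for illustration. Because a ligand whose top-scoring pose is correct also counts as a best-of-top-20 success, the conditional ranking rate is simply the ratio of the two rates, which agrees with the caption: 0.76 / 0.94 ≈ 0.81 for cognate docking and 0.25 / 0.60 ≈ 0.42 for cross-docking.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD in Angstroms between two poses given as equal-length lists of (x, y, z).

    Assumes atoms are listed in the same order in both poses and that both
    poses are already in the same protein frame (no superposition, no
    symmetry correction).
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and the same length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def success_rates(rmsds_by_ligand, threshold=2.0, top_n=20):
    """Return (top_rate, best_rate, ranking_rate) for a set of docked ligands.

    rmsds_by_ligand: one list per ligand holding the RMSD of each returned
    pose to the crystallographic pose, ordered best score first
    (hypothetical input layout).
    """
    n = len(rmsds_by_ligand)
    if n == 0:
        raise ValueError("need at least one ligand")
    top_hits = sum(1 for r in rmsds_by_ligand if r[0] <= threshold)
    best_hits = sum(1 for r in rmsds_by_ligand if min(r[:top_n]) <= threshold)
    # Conditional rate: given that a good pose exists among the top_n,
    # how often is it also the top-scoring one?
    ranking_rate = top_hits / best_hits if best_hits else 0.0
    return top_hits / n, best_hits / n, ranking_rate


if __name__ == "__main__":
    toy = [[1.0, 3.0], [3.0, 1.5], [4.0, 5.0], [0.5, 0.4]]  # made-up RMSDs
    print(success_rates(toy))
```

The toy input is invented solely to exercise the functions; it does not reproduce any data from the paper.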
Figure 6
Protein flexibility is significantly different among the targets. At left, the 5 conformations of PDE4b are shown along with a single ligand. At right, 5 conformations of CDK2 are shown, also with a ligand. PDE4b exhibits very little movement overall and has relatively little backbone variation. CDK2 contrasts by exhibiting movement in all atoms.
Figure 7
The left plot shows performance of Surflex-Dock in cross docking for the top scoring returned pose for each of 211 non-cognate test ligands across 8 different targets. The right plot shows performance considering the best pose returned among the top 20. Performance shows a substantial improvement resulting from the use of multiple protein conformations over a single protein conformation (green and blue curves, respectively). Making use of fragments from the structures of the cognate ligands of the protein conformations leads to deeper exploration of the region of pose space that is likely to be correct, which improves performance further (blue curve). The multi-structure docking protocol with fragment hints is nearly as fast as the standard single-protein Surflex-Dock protocol with geometric search parameters.
Figure 8
Cross docking of the ligand from 1H08 into CDK2. At left, the protomol, cognate ligand, and a subfragment are shown. The protons from the steric protomol probes have been hidden, and the subfragment is shown with fat sticks. In the middle, two conformations from the top scoring pose family (resulting from heavy-atom pocket refinement) are shown along with the crystallographic alternatives (green, with the alternate amine positions numbered 1 and 2) and the fragment that helped guide the docking (blue). At right, the three protein structures that contributed to the final pose family are shown in blue (1DM2, 1H0W, and 1OIU), with the refinements due to post-docking optimization shown in red. There are significant movements in the protein that allow the recognition of this pose family as being optimal, particularly near the carboxylate by the arrow. Pocket refinement with protons yields an incorrect pose family, and the pose family without pocket refinement does not span both solutions.
Figure 9
At left is a depiction of the top scoring pose family for a ligand of CDK2. The portion of the ligand that is deep within the pocket (at top) exhibits relatively little variation, but the portion that extends toward solvent exhibits a variety of reasonable orientations. The crystallographic determination yielded two alternative conformations (shown in yellow and green), which are spanned by the pose family. At right is shown the comparison of using purely the top scoring pose of a ligand (red line) compared with using the top pose families from either the initial docking (green line), the result of post-docking pocket optimization with all protein atoms (blue line), or post-docking pocket optimization with protons only (purple line). The use of pose families makes only a nominal improvement at the 2Å level, but the physical depiction of pose variation is likely to be useful, as in this case where an accurate depiction of mobility is made.
Figure 10
The top plot shows the relationship between pose family agreement (see text) and the accuracy of the top scoring pose family from the non-rescored docking run. There is a very strong relationship (Kendall’s Tau 0.45, p ≪ 0.01 by permutation). The bottom plot shows the cumulative histogram of predicted single pose family accuracy for the cases in which pose family agreement is high (red) or low (green). In the high agreement cases (120/211 or 57%), the expectation is that 80% of the time, the top pose family contains the correct docking. Conversely, for the cases of disagreement (the remaining 43%), the success rate is closer to 25%. However, if we consider the top pose family for each of the three scoring methods (blue), our success rate doubles, to 50%. These differences are highly statistically significant (Fisher’s exact test on the difference of proportions of success/failure at 2.0Å rmsd). Note that the ligands in the high-agreement cases do not differ in flexibility from those in the low-agreement cases (6.2 vs. 7.4).
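The kind of rank-correlation test mentioned in the Figure 10 caption can be sketched as follows. This is an illustration only: the caption does not say which Kendall's tau variant or permutation scheme was used, so the tau-b formula, the two-sided test, the number of permutations, and all function names below are assumptions.

```python
import math
import random


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of numbers (handles ties)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from tau-b
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the observed tau-b.

    Shuffles y relative to x and counts how often the permuted |tau| is at
    least as large as the observed |tau| (add-one correction applied).
    """
    observed = abs(kendall_tau_b(x, y))
    rng = random.Random(seed)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(kendall_tau_b(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    agreement = [0.1, 0.4, 0.4, 0.7, 0.9, 1.0]  # made-up agreement scores
    accuracy = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]   # made-up success flags
    print(kendall_tau_b(agreement, accuracy))
    print(permutation_p_value(agreement, accuracy, n_perm=500))
```

The quadratic pair loop is adequate for a few hundred ligands such as the 211 considered here; the example values are invented and carry no meaning beyond exercising the functions.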
Figure 11
Cross docking of the ligand from 1Y2H into PDE4b. At left, the top pose family from the proton-based pocket refinement (probability 1.00) is shown along with the crystallographic pose (green). There is a good deal of uncertainty in the placement of the chlorophenyl, which has an impact on the position of the remainder of the ligand. The center panel shows the original protein conformation (blue) and the modified one (red) that leads to the most dominant ligand pose from the pose family. Reorientation of a hydroxyl proton (TYR233, indicated by an arrow in the middle panel, at bottom on right) is critical to allow room for the ligand, and minor movement of a donor proton on GLN443 is also important in yielding correct recognition. The ligand extends well beyond the density in the area of the chlorophenyl (right), which suggests that alternative orientations are reasonable to propose.
Figure 12
Cross docking of the ligand from 1UOM into ESR1. In this case, all three scoring methods agreed on the top scoring pose family. At left, the crystallographic pose is shown with the pose family from heavy-atom pocket optimization. Only the antagonist structures (1YIM, 1SJ0, and 2ERT) contributed significantly to the pose family shown. In the middle, the protein atom movement is shown (red), which is minimal in this case. The ligand is relatively similar in structure and binding preference to the three cognate antagonists among the five structures used. At right, a pose resulting from docking to an agonist structure (1X7R). This pose is reasonable and close to correct, but the protein conformation resulting from heavy atom optimization cannot replicate the wholesale rearrangement of the protein (ASP351 is marked in both panels with a green arrow). LEU540 (labeled in the right panel) moves so much in the true antagonist-bound form that it does not appear in the middle depiction.
Figure 13
Cross docking of the ligand from 1FPC into F2. In this case, the original baseline docking yielded an incorrect top pose family, with the guanidinium correctly placed, but with the naphthyl substituent significantly misplaced (shown at left). However, both methods of rescoring with protein pocket adaptation yielded the correct pose in the top family (middle panel). Accommodation of the ethyl-pyridine involves movement of TRP86 when heavy atoms are allowed to move (right panel). It is likely that ligand non-covalent self-interaction also contributes toward improved recognition in this and similar cases.

