Biomolecules. 2023 Nov 9;13(11):1633.
doi: 10.3390/biom13111633.

Assessing Genetic Algorithm-Based Docking Protocols for Prediction of Heparin Oligosaccharide Binding Geometries onto Proteins

Samuel G Holmes et al. Biomolecules.

Abstract

Although molecular docking has evolved dramatically over the years, its application to glycosaminoglycans (GAGs) has remained challenging because of their intrinsic flexibility, highly anionic character and ill-defined binding sites on proteins. GAGs have been treated as either fully "rigid" or fully "flexible" in molecular docking. We reasoned that an intermediate semi-rigid docking (SRD) protocol may be better for the recapitulation of native heparin/heparan sulfate (Hp/HS) topologies. Herein, we study 18 Hp/HS-protein co-complexes containing chains from disaccharide to decasaccharide using genetic algorithm-based docking with rigid, semi-rigid, and flexible docking protocols. Our work reveals that rigid and semi-rigid protocols recapitulate native poses for longer chains (5→10-mers) significantly better than the flexible protocol, while 2→4-mer poses are better predicted using the semi-rigid approach. More importantly, the semi-rigid docking protocol is likely to perform better when no crystal structure information is available. We also present a new parameter for parsing selective versus non-selective GAG-protein systems, which combines two computed quantities: consistency of binding (i.e., RMSD) and docking score (i.e., GOLD Score). The new semi-rigid protocol in combination with the new computational parameter is expected to be particularly useful in high-throughput screening of GAG sequences for identifying promising druggable targets as well as drug-like Hp/HS sequences.

Keywords: glycosaminoglycans; heparin/heparan sulfate; knowledge-based docking; molecular docking.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Types of docking protocols used in the literature. (A) Two types of protocols are typically used for GAGs in the literature: “rigid” and “flexible” docking. This work compares these two with a semi-rigid docking protocol, which affords better predictive success, especially for longer GAG sequences. Whereas the rigid and flexible docking approaches hold glycosidic torsions (Φ and Ψ) either completely invariant (±0°) or fully flexible (±180°), the semi-rigid protocol allows partial flexibility (±30°) around the most preferred torsions reported in the literature. (B) A Ramachandran plot depicting Φ and Ψ for all 41 Hp/HS–protein complexes available in the Protein Data Bank (www.rcsb.org, accessed on 1 July 2021). Φ and Ψ are categorized into two groups: acid–amine (UA→GlcN) and amine–acid (GlcN→UA).
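The three torsion windows named in this caption (±0°, ±30°, ±180°) can be illustrated with a small check. This is a minimal sketch, not the GOLD configuration used in the study: the `TorsionWindow` class, the `window` helper and the example torsion of −60° are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorsionWindow:
    """Allowed range for one glycosidic torsion (hypothetical helper)."""
    center: float      # starting or preferred torsion, in degrees
    half_width: float  # allowed deviation on either side, in degrees

    def allows(self, angle: float) -> bool:
        # Shortest signed angular distance, wrapped into [-180, 180)
        delta = (angle - self.center + 180.0) % 360.0 - 180.0
        return abs(delta) <= self.half_width


# Half-widths taken from the caption: rigid ±0°, semi-rigid ±30°, flexible ±180°
HALF_WIDTHS = {"rigid": 0.0, "semi-rigid": 30.0, "flexible": 180.0}


def window(protocol: str, preferred: float) -> TorsionWindow:
    """Build the torsion window for a protocol around a preferred angle."""
    return TorsionWindow(preferred, HALF_WIDTHS[protocol])


if __name__ == "__main__":
    phi = -60.0  # illustrative value only, not taken from the paper
    for name in HALF_WIDTHS:
        print(name, window(name, phi).allows(-85.0))
```

The wrap-around in `allows` matters because torsions are periodic: a window centered at 170° should still accept −175°.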
Figure 2
Recapitulation of the native pose using a rigid docking protocol. Each sequence was redocked back into the crystal structure in triplicate using 100 GA runs, each being allowed 100,000 genetic operations. The top two poses from each replicate experiment were selected, compiled and used for analysis. (A) The docking of each Hp/HS oligosaccharide onto its target protein was analyzed by calculating RMSD_AVERAGE, RMSD_LOWEST and RMSD_INTRAPOSE, which convey the root mean square difference (RMSD) between the native pose and the average of the top six rigid docking poses, the difference between the native pose and the one docked pose that most closely matches the native, and the intra-pose difference between the six docked poses, respectively. (B) Representative example of a successful recapitulation (RMSD_AVERAGE ≤ 2.5 Å) of the native pose of an Hp/HS tetrasaccharide (left; 6LJL) and a not-so-successful prediction (RMSD_AVERAGE > 2.5 Å) of the native pose of an Hp/HS disaccharide (right; 1U4L). Native poses in both are shown in green, while docked poses are in orange. (C–E) Three different RMSDs as a function of the IDs of the co-complex structures reported in the PDB. X-axis labels represent the PDB code followed by chain length in brackets. The red dotted line indicates the 2.5 Å cut-off.
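The three RMSD summaries described in this caption (average, lowest and intra-pose) can be sketched as follows. The caption does not give the exact formulas, so this sketch makes two assumptions: the "average" is taken as the mean of the per-pose RMSDs to the native pose, and the intra-pose value as the mean of all pairwise RMSDs among the docked poses. Poses are assumed to share atom ordering and a common reference frame, so no superposition is done. The function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean
from typing import Dict, Sequence, Tuple

Pose = Sequence[Tuple[float, float, float]]  # one (x, y, z) per atom

RMSD_CUTOFF = 2.5  # Å, the success threshold quoted in the caption


def rmsd(a: Pose, b: Pose) -> float:
    """RMSD between two poses with identical atom ordering (no fitting)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("poses must be non-empty and of equal length")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def pose_metrics(native: Pose, docked: Sequence[Pose]) -> Dict[str, float]:
    """Summarize docked poses (e.g. the top six) against the native pose."""
    if len(docked) < 2:
        raise ValueError("need at least two docked poses")
    to_native = [rmsd(native, pose) for pose in docked]
    pairwise = [rmsd(p, q) for p, q in combinations(docked, 2)]
    return {
        "rmsd_average": mean(to_native),   # assumed: mean of per-pose RMSDs
        "rmsd_lowest": min(to_native),     # best single pose
        "rmsd_intrapose": mean(pairwise),  # assumed: mean pairwise RMSD
    }


def is_recapitulated(metrics: Dict[str, float]) -> bool:
    """Apply the caption's criterion: average RMSD of at most 2.5 Å."""
    return metrics["rmsd_average"] <= RMSD_CUTOFF
```

A low intra-pose value with a high average would indicate poses that agree with each other but not with the crystal structure, which is why the three numbers are reported separately.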
Figure 3
Recapitulation of the native pose using a flexible docking protocol. Each sequence was redocked back into the crystal structure in triplicate using 100 GA runs, each being allowed 100,000 genetic operations. The top two poses from each replicate experiment were selected, compiled and used for analysis. (A) The flexible docking protocol affords full flexibility to glycosidic bonds and ring substituents (shown in red). A typical HS hexasaccharide encompasses more than 36 rotatable bonds arising from a minimum of 5 and 7 rotatable bonds in UA and GlcN residues, respectively (labeled 1→7 in blue). Ring puckers are held invariant from their starting state in the flexible docking protocol. (B) As expected, GOLD dock time for fully flexible docking increased linearly with chain length, although this does not imply that flexible docking yields recapitulation of the native pose (see text for details). (C–E) Three different RMSDs as a function of the IDs of the co-complex structures reported in the PDB. X-axis labels represent the PDB code followed by chain length in brackets. The red dotted line indicates the 2.5 Å cut-off.
Figure 4
Recapitulation of the native pose using a semi-rigid docking (SRD) protocol. As for rigid and flexible protocols, each sequence was redocked back into the crystal structure in triplicate using 100 GA runs, each being allowed 100,000 genetic operations. (A–C) Three different RMSDs as a function of the IDs of the co-complex structures reported in the PDB. X-axis labels represent the PDB code followed by chain length in brackets. The red dotted line indicates the 2.5 Å cut-off. (D) Successful recapitulation of native poses of 3UAN and 1E0O by rigid (left) and semi-rigid (middle) docking protocols but not by the flexible docking protocol (right). Docked poses (shown in orange) are superimposed on native poses (green) for the two sequences. The protein ribbon is shown in light grey.
Figure 5
Variation in observed Φ/Ψ from the native pose following semi-rigid and flexible dockings. Shown are Δφ and Δψ, the differences in φ (A,C,E) and ψ (B,D,F) between the native pose and the average of the docked poses obtained following semi-rigid (A,C) and flexible (B,D) dockings, respectively, as a function of the co-complex structure and glycosidic bonds (2→1, 3→2, etc., where 2→1 refers to the glycosidic bond between the reducing end residue #1 and the penultimate residue #2). Although the differences (Δφ and Δψ) could be either negative or positive, only their magnitudes are shown (i.e., |Δφ| and |Δψ|). (E,F) show the average Δφ and Δψ, respectively, across all 18 sequences from di- to decasaccharide for SRD and flexible docking protocols. See text for details.
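The magnitude of a torsion difference such as Δφ or Δψ can be computed as sketched below. The caption does not say how angles were averaged or whether differences were wrapped, so two assumptions are made here: docked torsions are averaged with a circular mean, and differences are wrapped to the shortest arc before taking the magnitude. Both function names are hypothetical.

```python
import math
from typing import Sequence


def circular_mean_deg(angles: Sequence[float]) -> float:
    """Mean direction of angles in degrees (assumed averaging scheme)."""
    if not angles:
        raise ValueError("need at least one angle")
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))


def abs_torsion_diff(native_deg: float, docked_deg: float) -> float:
    """Magnitude of the shortest angular difference, in the range [0, 180]."""
    delta = (docked_deg - native_deg + 180.0) % 360.0 - 180.0
    return abs(delta)


def delta_from_native(native_deg: float, docked_degs: Sequence[float]) -> float:
    """|Δ| between the native torsion and the circular mean of docked torsions."""
    return abs_torsion_diff(native_deg, circular_mean_deg(docked_degs))
```

Wrapping avoids reporting a 340° difference for torsions of −170° and 170°, which are only 20° apart.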
Figure 6
Comparison of docked poses with the native pose of disaccharides. The disaccharide sequences from 1U4L, 1U4M and 3B9F were docked onto their proteins in triplicate using 100 GA runs (100,000 genetic operations) with either the rigid, semi-rigid or flexible docking protocol. The top two poses from each replicate experiment were selected, compiled and used for visualization. The native pose is shown in green. Docked poses are in orange.
Figure 7
Identifying high-affinity, high-selectivity GAG sequences. (A) Average GOLD Scores calculated for six docked poses following rigid, semi-rigid and flexible docking of the 18 Hp/HS sequences onto their target proteins. GOLD Scores reported here were calculated for the protocols with 100 GA runs, each with 100,000 operations. X-axis labels represent the PDB code followed by chain length in brackets. Error bars show the standard deviation of scores observed in triplicate docking experiments. (B) A plot of the ratio of the average GOLD Score to RMSD_AVERAGE for rigid, semi-rigid and flexible docking of the 18 Hp/HS sequences. The black dotted line shows an arbitrary cut-off (Ratio ≥ 50) that can be used to identify high-affinity, high-selectivity sequences. From the 18 sequences studied here, only one tetra- (6LJL), one penta- (1TB6), two hexa- (4AK2 and 3UAN), one octa- (3INA) and one decasaccharide (1E0O) are predicted to pass the threshold.
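The ratio in panel (B) divides the average GOLD Score by the average RMSD and compares the result with the cut-off of 50. A minimal sketch of that filter follows; the function names are hypothetical, and the numbers in the usage example are invented placeholders, not results from the study.

```python
from statistics import mean
from typing import Sequence

RATIO_CUTOFF = 50.0  # the arbitrary cut-off (Ratio >= 50) named in the caption


def score_to_rmsd_ratio(gold_scores: Sequence[float], rmsd_average: float) -> float:
    """Average GOLD Score divided by the average RMSD (in Å)."""
    if rmsd_average <= 0:
        raise ValueError("rmsd_average must be positive")
    return mean(gold_scores) / rmsd_average


def passes_cutoff(
    gold_scores: Sequence[float],
    rmsd_average: float,
    cutoff: float = RATIO_CUTOFF,
) -> bool:
    """True when the score-to-RMSD ratio meets or exceeds the cut-off."""
    return score_to_rmsd_ratio(gold_scores, rmsd_average) >= cutoff


if __name__ == "__main__":
    # Placeholder inputs for two made-up systems (not data from the paper)
    examples = {
        "system_A": ([120.0, 130.0, 125.0], 2.0),
        "system_B": ([120.0, 130.0, 125.0], 5.0),
    }
    for name, (scores, avg_rmsd) in examples.items():
        print(name, passes_cutoff(scores, avg_rmsd))
```

Because the RMSD sits in the denominator, a high score alone is not enough: poses must also be consistent with the reference for the ratio to clear the threshold.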
