J Chem Theory Comput. 2023 May 9;19(9):2535-2556.
doi: 10.1021/acs.jctc.2c01087. Epub 2023 Apr 24.

WaterKit: Thermodynamic Profiling of Protein Hydration Sites

Jerome Eberhardt et al. J Chem Theory Comput.

Abstract

Water desolvation is one of the key components of the free energy of binding of small molecules to their receptors. Thus, understanding the energetic balance of solvation and desolvation resulting from individual water molecules can be crucial when estimating ligand binding, especially when evaluating different molecules and poses, as is done in High-Throughput Virtual Screening (HTVS). Over recent decades, several methods have been developed to tackle this problem, ranging from fast approximate methods (usually empirical functions using either discrete atom-atom pairwise interactions or continuum solvent models) to more computationally expensive and accurate ones, mostly based on Molecular Dynamics (MD) simulations, such as Grid Inhomogeneous Solvation Theory (GIST) or Double Decoupling. On the one hand, MD-based methods are prohibitively expensive to use in HTVS to estimate the role of waters on the fly for each ligand. On the other hand, fast and approximate methods show an unsatisfactory level of accuracy, with low agreement with results obtained with the more expensive methods. Here we introduce WaterKit, a new grid-based sampling method with explicit water molecules to calculate thermodynamic properties using the GIST method. Our results show that the discrete placement of water molecules is successful in reproducing the position of crystallographic waters with very high accuracy, as well as providing thermodynamic estimates with accuracy comparable to more expensive MD simulations. Unlike these methods, WaterKit can be used to analyze specific regions on the protein surface (such as the binding site of a receptor) without having to hydrate and simulate the whole receptor structure. The results show the feasibility of a general and fast method to compute thermodynamic properties of water molecules, making it well-suited to be integrated into high-throughput pipelines such as molecular docking.
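
For orientation, the GIST quantities compared in the figure captions (Esw, Eww, TStrans, TSorient and ΔG) are commonly combined into a per-site free energy as sketched below. This is the usual first-order GIST form and is stated here as an assumption; the abstract itself does not give the expression. It assumes that the energies are referenced to bulk water and that the entropy terms are reported as T·S, so they enter with a minus sign. If an implementation reports −TS directly, the terms are added instead.

```latex
\Delta G \;\approx\; E_{sw} + E_{ww} - TS^{\mathrm{trans}} - TS^{\mathrm{orient}}
```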

Figures

Fig. 1.
Schematic representation of the four main steps of the WaterKit sampling protocol: 1) identification of anchor points and preparation of affinity grid maps; 2) placement order (a) and sampling methods (b and c) used to generate an ensemble of discrete water molecule conformations; 3) minimization of the fully hydrated system; 4) estimation of the thermodynamic properties of water molecules using GIST.
Fig. 2.
Difference between spherical maps using (A) AutoDock-Vina non-directional donor-acceptor oxygen probe (O_DA) and (B) TIP3P water model for the quaternary amine group (R-NH3+) in lysine residues. The spherical maps were contoured at 85 % of the most favorable energy value, corresponding to −0.22 and −11.56 kcal/mol for the AutoDock-Vina O_DA probe and TIP3P model, respectively.
Fig. 3.
Results for each WaterKit model tested, averaged over all the studied systems (except PDB 1hpx for the hivpr protein), varying the hydrogen bond anchor point definition (APpolar: polar HB donors/acceptors only; APall: polar and non-polar hydrogen atoms), the spherical water model (TIP3P water, Vina oxygen donor-acceptor) and the number of minimization steps. All the hydration sites identified in the ligand binding pocket were compared to MD simulations. Water placement, with a distance cutoff of 1.0 Å, was evaluated using the True Positive Rate (TPR, sensitivity) and the Positive Predictive Value (PPV, precision). The different GIST energy components (Esw, Eww, TStrans, TSorient) as well as ΔG were compared using the Root Mean Square Deviation (RMSD) and the coefficient of determination (r²).
Fig. 4.
(A) Results for each system studied using the best WaterKit model: polar Hydrogen Bond Anchors (HBA), with positions identified using the Vina O_DA spherical water model interactions and relaxed with 100 steps of minimization. All hydration sites identified in the ligand binding pocket were compared to MD simulations. The number of True Positives (TP) corresponds to the total number of water molecules correctly identified by WaterKit, using a distance cutoff of 1.0 Å. The different GIST energy components (Esw, Eww, TStrans, TSorient) as well as ΔG were compared using the Root Mean Square Deviation (RMSD) and the coefficient of determination (r²). The RMSD and r² calculated for the MD triplicates are reported as a reference. (B) The water placement was evaluated using the True Positive Rate (TPR, sensitivity) and the Positive Predictive Value (PPV, precision) metrics. The sensitivity and precision were calculated on all the identified hydration sites, as well as on different sets of hydration sites based on their Esw energy, from −5 to −2 kcal/mol.
Fig. 5.
Hydration sites found in the ital ligand binding site (PDB: 2ica) using MD simulations (triplicates) and WaterKit. Hydration sites found in MD simulations are represented as blue spheres. Hydration sites found with WaterKit are represented as spheres colored in green (equivalent found in MD simulations) and red (no equivalent found in MD simulations). The protein is shown as cartoon (white) with side chains in sticks. The co-crystallized ligand is colored in orange and represented as sticks.
Fig. 6.
Hydration sites found in the hsp90α ligand binding site (PDB: 1uyg) using MD simulations (triplicates) and WaterKit. (A) Comparison of MD and WaterKit predictions; (B) WaterKit hydration sites near the purine ring; (C) MD hydration sites near the purine ring. Hydration sites found with MD are represented by blue spheres. Hydration sites found with WaterKit are shown as spheres colored in green (also found in MD simulations) and red (no equivalent found in MD simulations). Spheres are labelled by their calculated solute-water enthalpy (Esw) in kcal/mol. The protein is shown in white with cartoons for the secondary structure and sticks for side chains and the bound ligand. Hydrogen bonds are shown as yellow dotted lines.
Fig. 7.
Hydration sites found around the crystallographic water W301 in hivpr (PDB: 2zye) using (A) WaterKit and (B) MD simulations (triplicates). Hydration sites are represented by spheres colored in blue (found in MD simulations), green (found by WaterKit and present in MD simulations) and red (found by WaterKit but not present in MD simulations). Spheres are labelled by their calculated free energy (ΔG) in kcal/mol. The protein is colored in white, with the secondary structure shown as cartoon and side chains as sticks. The co-crystallized inhibitor KNI-272 is shown as orange sticks. Hydrogen bonds are represented by yellow dotted lines. The oxygen density (gO) maps from WaterKit and from the MD simulation averages, contoured at 5.0 bulk density, are colored in green cyan and deep purple, respectively. (C) Network of hydration sites predicted by WaterKit within 1.5 Å of the heavy atoms of ligand KNI-272 bound in HIV-1 PR. Hydration sites found by WaterKit are represented by spheres colored in green (also found in MD simulations) and red (no equivalent found in MD simulations). The KNI-272 ligand is shown as gray sticks. Hydration sites overlapping with key hydrogen bond features are surrounded by an additional transparent sphere.
Fig. 8.
Density, expressed in units of bulk density, in the ligand binding pocket of fabp4 (A) during MD simulations (triplicates) of 200 ns each and (B) in the ensemble of 10,000 frames generated with WaterKit. The average densities in WaterKit and MD simulations are represented by dotted red lines. The histograms represent the observed probability distribution of the density.
Fig. 9.
Comparison between (A) conserved crystallographic water molecules found in apo structures and (B-F) hydration sites predicted with WaterKit in the neuraminidase active site. Crystallographic water molecules are shown in red and all water molecules predicted by WaterKit in gray; crystallographic water molecules and WaterKit hydration sites are shown as red and green spheres, respectively. (A) Sialic acid bound to neuraminidase (PDB id: 2bat) with crystallographic water molecules from apo neuraminidase structures (PDB entries 3nn9, 4nn9, 5nn9, 6nn9, 6crd, 6d3b, 6mcx) within 1.5 Å of the heavy atoms of the ligands shown in B-F. (B) Sialic acid (PDB id: 2bat) superposed on WaterKit results. (C) Oseltamivir (PDB id: 2ht7). (D) Zanamivir (PDB id: 3ckz). (E) FeqGuDFSA (PDB id: 3w09). (F) BANA206 (PDB id: 1b9v).
Fig. 10.
Hydration sites found (A) forming a five-membered ring-like structure in the streptavidin ligand binding site and (B) overlapping with biotin in the ligand binding site, using MD simulations (triplicates) and WaterKit (WK). Hydration sites found in MD simulations are represented by blue spheres. Hydration sites found with WaterKit are represented by spheres colored in green (equivalent found in MD simulations) and red (no equivalent found in MD simulations). The spheres are labelled by their solute-water enthalpy (Esw) in kcal/mol. Crystallographic water molecules found in apo structures of streptavidin (PDB ids: 7knk, 7ek8 and 7ek9) are represented by red crosses. The protein is colored in white and represented as cartoon with side chains in sticks. The oxygen density (gO) maps from WaterKit and MD simulations, contoured at 5.0 bulk density, are colored in green cyan and deep purple, respectively. The gO map for MD simulations was obtained by averaging the gO maps from the MD replicates. Hydrogen bonds between hydration sites in MD are represented by yellow dotted lines.
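
The evaluation described in the captions of Figs. 3 and 4 (site matching with a 1.0 Å distance cutoff, then TPR/PPV for placement and RMSD/r² for the energies) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the matching is a simple greedy one-to-one assignment (the paper may use a different pairing rule), and r² is taken here as the squared Pearson correlation, which is an assumption.

```python
import numpy as np


def match_sites(pred, ref, cutoff=1.0):
    """Greedy one-to-one matching of predicted to reference hydration sites.

    pred, ref: sequences of (x, y, z) coordinates in Å.
    Returns a list of (pred_index, ref_index) pairs whose distance is <= cutoff.
    NOTE: greedy closest-first pairing is an assumption made for this sketch.
    """
    pred = np.asarray(pred, dtype=float).reshape(-1, 3)
    ref = np.asarray(ref, dtype=float).reshape(-1, 3)
    if len(pred) == 0 or len(ref) == 0:
        return []
    # Pairwise distance matrix, shape (n_pred, n_ref)
    dist = np.linalg.norm(pred[:, None, :] - ref[None, :, :], axis=-1)
    pairs, used_pred, used_ref = [], set(), set()
    # Visit candidate pairs from the closest to the farthest
    for flat in np.argsort(dist, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat, dist.shape))
        if dist[i, j] > cutoff:
            break  # all remaining pairs are farther apart than the cutoff
        if i in used_pred or j in used_ref:
            continue  # each site may be matched only once
        pairs.append((i, j))
        used_pred.add(i)
        used_ref.add(j)
    return pairs


def placement_metrics(pred, ref, cutoff=1.0):
    """Return (TP, TPR, PPV): TPR = TP / n_ref (sensitivity), PPV = TP / n_pred (precision)."""
    tp = len(match_sites(pred, ref, cutoff))
    tpr = tp / len(ref) if len(ref) else float("nan")
    ppv = tp / len(pred) if len(pred) else float("nan")
    return tp, tpr, ppv


def rmsd_and_r2(x, y):
    """RMSD and squared Pearson correlation between two matched value arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rmsd = float(np.sqrt(np.mean((x - y) ** 2)))
    r = np.corrcoef(x, y)[0, 1]
    return rmsd, float(r ** 2)
```

In use, `pred` would hold the WaterKit hydration site coordinates and `ref` the MD-derived ones; the matched pairs returned by `match_sites` then select which Esw, Eww, TStrans, TSorient or ΔG values are passed to `rmsd_and_r2`.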
