Proc Natl Acad Sci U S A. 2008 Dec 23;105(51):20239-44.
doi: 10.1073/pnas.0810818105. Epub 2008 Dec 10.

Solvent dramatically affects protein structure refinement

Gaurav Chopra et al. Proc Natl Acad Sci U S A. 2008.

Abstract

One of the most challenging problems in protein structure prediction is improvement of homology models (structures within 1–3 Å Cα rmsd of the native structure), also known as the protein structure refinement problem. It has been shown that improvement could be achieved using in vacuo energy minimization with molecular mechanics and statistically derived continuously differentiable hybrid knowledge-based (KB) potential functions. Globular proteins, however, fold and function in aqueous solution in vivo and in vitro. In this work, we study the role of solvent in protein structure refinement. Molecular dynamics in explicit solvent and energy minimization in both explicit and implicit solvent were performed on a set of 75 native proteins to test the various energy potentials. A more stringent test for refinement was performed on 729 near-native decoys for each native protein. We use a powerfully convergent energy minimization method to show that implicit solvent (GBSA) provides greater improvement for some proteins than the KB potential: 24 of 75 proteins showed an average improvement of >20% in Cα rmsd from the native structure with GBSA, compared to just 7 proteins with KB. Molecular dynamics in explicit solvent moved the structures further away from their native conformation than the initial, unrefined decoys. Implicit solvent gives rise to a deep, smooth potential energy attractor basin that pulls toward the native structure.
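The Cα rmsd used above as the distance from the native structure can be illustrated with a minimal sketch. It assumes two equal-length arrays of Cα coordinates and uses the standard Kabsch superposition; the function name `kabsch_rmsd` is hypothetical, and this is the plain (unweighted) rmsd, not the wRMS measure referred to in the figure captions.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Plain RMSD between two (N, 3) coordinate sets after optimal
    rigid-body superposition (Kabsch algorithm). Illustrative only."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its SVD give the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under this sketch, a decoy counts as "improved" when its rmsd to the native coordinates is lower after refinement than before.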

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Percentage change (PC) values averaged over the 729 decoys for each of the 75 proteins in energy minimization runs with KB in Encad (40, 41), GBSA implicit solvent in Tinker (32), and OPLS explicit solvent in Gromacs (–38). The 75 proteins are arranged in decreasing order of improvement for the KB potential. A negative value of PC corresponds to improvement in the wRMS value of the energy minimized structure compared with that of the starting structure. Best overall improvement is for KB, with a mean overall PC (〈PC〉) of −11.56% compared with −4.0% for GBSA and −0.2% for OPLS. Of the 46 proteins improved by GBSA, 24 had a PC value better than −20%, compared with just 7 proteins with KB. Overall, GBSA performs better than KB for 30 of the 75 proteins. However, GBSA moved 29 proteins away from the native state, whereas KB moved just 3. The drawings beside the chart are colored by B-factor (blue for a low value and red for a high value) to show the greater flexibility of loops and chain termini, which is compensated for by wRMS.
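The bookkeeping behind these counts can be sketched as follows. The caption does not give the formula for PC, so this sketch assumes PC = 100 × (wRMS_final − wRMS_initial) / wRMS_initial, averaged per decoy; the function names and the example numbers are hypothetical and are not data from the paper.

```python
def percentage_change(wrms_initial, wrms_final):
    """Assumed definition: negative when the minimized structure is
    closer to the native structure than the starting decoy."""
    return 100.0 * (wrms_final - wrms_initial) / wrms_initial

def mean_pc(decoy_pairs):
    """Average PC over (wrms_initial, wrms_final) pairs for one protein."""
    values = [percentage_change(i, f) for i, f in decoy_pairs]
    return sum(values) / len(values)

def summarize(mean_pc_by_protein, strong_cutoff=-20.0):
    """Count proteins that improved, improved strongly, or got worse."""
    values = list(mean_pc_by_protein.values())
    return {
        "improved": sum(v < 0 for v in values),
        "better_than_cutoff": sum(v < strong_cutoff for v in values),
        "moved_away": sum(v > 0 for v in values),
    }

# Hypothetical example with three made-up proteins.
example = {
    "protA": mean_pc([(2.0, 1.5), (2.0, 1.5)]),  # -25.0
    "protB": mean_pc([(2.0, 1.75)]),             # -12.5
    "protC": mean_pc([(2.0, 2.5)]),              # +25.0
}
print(summarize(example))
```

Whether the paper averages per-decoy PC values or computes PC from averaged wRMS values is not stated in this caption; the sketch takes the first reading.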
Fig. 2.
Comparison of the GDT-HA scores for energy minimization with KB (solid) and GBSA (dashed). (A) The change in GDT-HA between the final minimized structure and the initial decoy, GDT-HA_final − GDT-HA_initial, is plotted against GDT-HA_initial for all 729 decoys of each of the 6 proteins that show a large negative percentage change (PC) for the KB and GBSA potentials. Values of GDT-HA_final − GDT-HA_initial are averaged over GDT-HA_initial bins of width 0.1. A higher value of GDT-HA_final − GDT-HA_initial indicates an improvement in the decoy (it gets closer to the native structure). Overall, 〈GDT-HA_final − GDT-HA_initial〉 is higher for GBSA, indicating that this method improves the decoy structure more than KB. (B) 1pne, a good case with GBSA that is less good with KB. The native structure of 1pne is shown in gray, the best decoy minimized with KB in red, and the best decoy minimized with GBSA in purple. GBSA gives a better match to the native structure than does KB. (C) 1nkd, a good case with KB. The lowest GDT-HA_initial value is better (≈0.7) than for 1pne (≈0.5). For 1nkd, GBSA and KB do equally well.
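The binned averaging described for panel A can be sketched as below. The helper `gdt_ha_from_distances` is a simplified illustration that assumes per-residue Cα distances from a single fixed superposition and the usual GDT-HA cutoffs of 0.5, 1, 2, and 4 Å; a full GDT calculation searches over many superpositions. All names and the sample values are hypothetical.

```python
import numpy as np

def gdt_ha_from_distances(distances, cutoffs=(0.5, 1.0, 2.0, 4.0)):
    """Simplified GDT-HA on a 0-1 scale from per-residue distances (in A)
    under one superposition: mean fraction of residues within each cutoff."""
    d = np.asarray(distances, dtype=float)
    return float(np.mean([(d <= c).mean() for c in cutoffs]))

def binned_mean_delta(gdt_initial, gdt_final, width=0.1):
    """Average (final - initial) GDT-HA within bins of the initial score.
    Returns {bin_index: mean_delta}; bin k covers [k*width, (k+1)*width)."""
    gi = np.asarray(gdt_initial, dtype=float)
    delta = np.asarray(gdt_final, dtype=float) - gi
    n_bins = int(round(1.0 / width))
    # The small epsilon keeps values sitting on a bin edge in the upper bin
    # despite floating-point rounding; a score of 1.0 goes in the last bin.
    idx = np.floor(gi / width + 1e-9).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)
    return {int(k): float(delta[idx == k].mean()) for k in np.unique(idx)}

# Hypothetical decoy scores before and after minimization.
initial = [0.52, 0.55, 0.71]
final = [0.60, 0.65, 0.70]
print(binned_mean_delta(initial, final))
```

A positive bin average means that decoys starting in that GDT-HA range moved, on average, closer to the native structure.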
Fig. 3.
MD of decoys in explicit solvent compared with energy minimization with the GBSA and KB potentials. (A) The PC values (y axis) are shown for the sampled set of 20 proteins, sorted in ascending order and plotted against their rank in the sort. Note that the indicator at rank 1 does not necessarily reference the same protein for every condition. Negative values of PC correspond to improvement of the structure. For the 20 proteins, KB gives the largest number of improved proteins. Some proteins run with GBSA have better PC values than with KB, whereas others do worse. There is almost no change for MD at 0 ps, and additional MD moves the structure away from the native, in that the 200-ps curve is always above those of 100 ps and 0 ps. (B) Diagram of 1lwba, which shows the most improvement for OPLS explicit solvent MD. Red, high-value B-factor; blue, low-value B-factor. (C) Diagram of 1lwba showing that, after 100 ps of MD (orange), the decoys moved closer to the native structure (gray) with a PC value of −23.2% ± 25.8% compared with −20.2% ± 27.9% at 200 ps (blue). (D) Diagram of 1h99a1, which shows the least improvement for OPLS explicit solvent MD, colored by B-factor with high values in red and low values in blue. (E) Diagram of 1h99a1 showing how MD moves the structure away from the native structure (gray) as time progresses, with a PC value of 75.3% ± 36.5% at 100 ps (orange) compared with 98.4% ± 40.3% at 200 ps (blue). Note the flexibility of the chain termini.
Fig. 4.
Directed movement on the potential energy surface for GBSA energy minimization. The green points mark the positions of the initial decoy structures, the red points mark the positions of the final energy minimized structures, and the line connecting them indicates the movement. These contoured projections of the multidimensional energy surface are made by selecting a random subset of 30 decoys before and after minimization, including the native structure (big pink disk), to construct a 61 × 61 matrix of pairwise wRMS values. The energy contours are filled with color that varies from blue for low energy to red for high energy. In this plot, points close on paper are generally also similar, with a small rmsd value, whereas points far apart on paper are different, with a large rmsd value. Good and bad cases are selected using the PC values for each of the proteins (shown in parentheses). The good cases show an attractor basin, seen most clearly for 1kpta. For the bad cases, the arrangement of the hills and valleys separates the native structure from the major energy basin where all of the decoys end up after energy minimization. Multidimensional scaling is done using Graphviz to get the clearest 2D representation of the 61-dimensional space defined by the wRMS matrix. We generate the contour plots using the potential energy at each point and MATLAB's 4-point smoothing.
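The projection step can be illustrated with a minimal sketch. The caption states that Graphviz was used for the multidimensional scaling; the code below swaps in classical (Torgerson) MDS in NumPy, which is a different and simpler algorithm, purely to show how a pairwise distance matrix becomes 2D plot coordinates. The function name and the synthetic 61-point input are assumptions, not the paper's data.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed points so that Euclidean distances
    approximate the entries of the symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances to get a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Keep the eigenvectors with the largest eigenvalues.
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:n_components]
    lam = np.clip(evals[order], 0.0, None)  # drop tiny negative values
    return evecs[:, order] * np.sqrt(lam)

# Synthetic stand-in for a 61 x 61 pairwise distance matrix
# (30 initial decoys, 30 minimized decoys, 1 native structure).
rng = np.random.default_rng(0)
points = rng.normal(size=(61, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

xy = classical_mds(D)
print(xy.shape)  # (61, 2)
```

With real wRMS values the matrix is generally not exactly embeddable in 2D, so the layout only approximates the pairwise distances, which is consistent with the caption's remark that nearby points are "generally" similar.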
Fig. 5.
Comparison of the directed movement on the potential energy surface for energy minimization with KB and GBSA and MD with OPLS in explicit solvent. The starting decoy structures are green points and the native structure is shown as a cyan disk. For KB and GBSA, the final energy minimized decoys are red points. For OPLS explicit MD, the pink points are the energy minimized structures at 0 ps, the blue points are the decoys at 100 ps, and the red points are the decoys at 200 ps. These projections of the multidimensional energy surface are made using a random subset of 30 decoys for each protein. For KB and GBSA energy minimization, a 61 × 61 matrix of pairwise wRMS values is used (30 initial decoys, 30 corresponding energy minimized decoys, and the native structure). For OPLS explicit MD, a 121 × 121 matrix is used (30 initial starting decoys; 30 corresponding decoys at 0 ps, 100 ps, and 200 ps, respectively; and the native structure). Good cases with KB energy minimization (1lvfa), GBSA energy minimization (1tml___2), and OPLS explicit MD (1lwba) are selected using the PC values for each of the proteins (shown in parentheses). For each of these 3 proteins, we compare directed movement on the energy surface caused by energy minimization with KB and GBSA and MD with OPLS. An attractor basin is seen for KB and GBSA energy minimization for all 3 proteins, but only with GBSA do the final decoys come very close to each other and to the native structure. For KB energy minimization of 1tml___2 and 1lwba, the energy surface contains many local minima near the native state, preventing it from being reached; in both these cases a clear attractor basin is seen with GBSA. For the best case with OPLS explicit MD (1lwba), initial simulation moves the protein closer to the native structure, but more simulation (100 to 200 ps) moves it away again. For the less good cases with OPLS explicit MD (1lvfa and 1tml___2), simulation moves the structure all over the energy surface, with the 200-ps points (red) farther from the native structure than the 100-ps points (blue) or the initial decoy points (green).

