Solvent dramatically affects protein structure refinement

Gaurav Chopra¹, Christopher M Summa, Michael Levitt

PMID: 19073921
PMCID: PMC2600579
DOI: 10.1073/pnas.0810818105

Proc Natl Acad Sci U S A. 2008 Dec 23;105(51):20239-44.

doi: 10.1073/pnas.0810818105. Epub 2008 Dec 10.


Affiliation

¹ Department of Structural Biology, Stanford University, Stanford, CA 94305-5126, USA.


Abstract

One of the most challenging problems in protein structure prediction is improvement of homology models (structures within 1-3 Å Cα rmsd of the native structure), also known as the protein structure refinement problem. It has been shown that improvement could be achieved using in vacuo energy minimization with molecular mechanics and statistically derived continuously differentiable hybrid knowledge-based (KB) potential functions. Globular proteins, however, fold and function in aqueous solution in vivo and in vitro. In this work, we study the role of solvent in protein structure refinement. Molecular dynamics in explicit solvent and energy minimization in both explicit and implicit solvent were performed on a set of 75 native proteins to test the various energy potentials. A more stringent test for refinement was performed on 729 near-native decoys for each native protein. We use a powerfully convergent energy minimization method to show that implicit solvent (GBSA) provides greater improvement for some proteins than the KB potential: 24 of 75 proteins show an average improvement of >20% in Cα rmsd from the native structure with GBSA, compared to just 7 proteins with KB. Molecular dynamics in explicit solvent moved the structures further away from their native conformation than the initial, unrefined decoys. Implicit solvent gives rise to a deep, smooth potential energy attractor basin that pulls toward the native structure.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig. 1.**
Percentage change (PC) values averaged over the 729 decoys of each of the 75 proteins, for energy minimization runs with KB in Encad (40, 41), GBSA implicit solvent in Tinker (32) and OPLS explicit solvent in Gromacs (–38). The 75 proteins are arranged in decreasing order of improvement for the KB potential. A negative value of PC corresponds to improvement in the wRMS value of the energy minimized structure compared with that of the starting structure. The best overall improvement is for KB, with a mean overall PC (〈PC〉) of −11.56% compared with −4.0% for GBSA and −0.2% for OPLS. Of the 46 proteins improved by GBSA, 24 had a PC value better than −20%, compared with just 7 proteins with KB. Overall, GBSA performs better than KB for 30 of the 75 proteins. However, GBSA moved 29 proteins away from the native state, whereas KB moved just 3. The drawings beside the chart are colored by B-factor (blue for a low value and red for a high value) to show the greater flexibility of loops and chain termini that is compensated for by wRMS.
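The caption states only that a negative PC means the minimized structure has a lower wRMS than the starting decoy; it does not give the formula. The sketch below assumes PC is the relative change in wRMS, expressed in percent, and that the per-protein value is a plain mean over decoys. The function names are hypothetical and are not taken from the paper.

```python
from statistics import mean


def percentage_change(wrms_initial, wrms_final):
    """Relative change in wRMS, in percent (assumed definition of PC).

    Negative values mean the final structure is closer to the native
    structure than the starting decoy was.
    """
    if wrms_initial <= 0:
        raise ValueError("initial wRMS must be positive")
    return 100.0 * (wrms_final - wrms_initial) / wrms_initial


def mean_percentage_change(pairs):
    """Average PC over (wrms_initial, wrms_final) pairs, one pair per decoy."""
    return mean(percentage_change(w0, w1) for w0, w1 in pairs)


if __name__ == "__main__":
    # Two made-up decoys: one improves by 25%, one worsens by 25%.
    decoys = [(2.0, 1.5), (2.0, 2.5)]
    print(mean_percentage_change(decoys))
```

Under this assumed definition, a decoy that starts at 2.0 Å and ends at 1.5 Å has a PC of −25%, which would count toward the "better than −20%" group described in the caption.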

**Fig. 2.**
Comparison of the *GDT-HA* scores for energy minimization with KB (solid) and GBSA (dashed). (A) The change in *GDT-HA* between final minimized structure and the initial decoy GDT-HA_final − GDT-HA_initial is plotted against GDT-HA_initial for all 729 decoys of each of the 6 proteins that show large negative percentage change (PC) for the KB and GBSA potentials. Values of GDT-HA_final − GDT-HA_initial are averaged over GDT-HA_initial bins of width 0.1. A higher value of GDT-HA_final − GDT-HA_initial indicates an improvement in the decoy (gets closer to the native structure). Overall 〈GDT-HA_final − GDT-HA_initial〉 is higher for GBSA indicating that this method improves the decoy structure more than KB. (B) Showing 1pne, a good case with GBSA, that is less good with KB. The native structure of 1pne is shown in gray, the best decoy minimized with KB in red, and the best decoy minimized with GBSA in purple. GBSA gives a better match to the native structure than does KB. (C) Showing 1nkd, a good case with KB. The lowest GDT-HA_initial value is better (≈0.7) than for 1pne (≈0.5). For 1nkd, GBSA and KB do equally well.
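The averaging in panel A can be sketched as follows: each decoy is assigned to a bin of width 0.1 by its initial GDT-HA, and the change GDT-HA_final − GDT-HA_initial is averaged within each bin. This is a minimal illustration with hypothetical names and made-up scores; the authors' exact binning convention (for example, how bin edges are handled) is not stated in the caption.

```python
import math
from collections import defaultdict


def binned_mean_delta(gdt_initial, gdt_final, width=0.1):
    """Average (final - initial) GDT-HA within bins of the initial score.

    Returns a dict mapping bin index k to the mean change, where bin k
    covers initial scores in [k * width, (k + 1) * width).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g0, g1 in zip(gdt_initial, gdt_final):
        k = math.floor(g0 / width)  # bin index for this decoy
        totals[k] += g1 - g0
        counts[k] += 1
    return {k: totals[k] / counts[k] for k in sorted(totals)}


if __name__ == "__main__":
    # Made-up scores on a 0-1 scale, not data from the paper.
    initial = [0.52, 0.58, 0.73]
    final = [0.56, 0.60, 0.70]
    for k, delta in binned_mean_delta(initial, final).items():
        print(f"bin starting at {k * 0.1:.1f}: mean change {delta:+.3f}")
```

A positive mean in a bin corresponds to decoys in that range moving closer to the native structure on average.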

**Fig. 3.**
Showing MD of decoys in explicit solvent compared with energy minimization with the GBSA and KB potentials. (A) The PC values (y axis) are shown for the sampled set of 20 proteins, sorted in ascending order and plotted against their rank in the sort. Note that the indicator at rank 1 does not necessarily reference the same protein for every condition. Negative values of PC correspond to improvement of the structure. For the 20 proteins, KB gives the largest number of improved proteins. Some proteins run with GBSA have better PC values than with KB, whereas others do worse. There is almost no change for MD at 0 ps, and additional MD moves the structure away from the native, in that the 200-ps curve is always above those of 100 ps and 0 ps. (B) Diagram of 1lwba, which shows the most improvement for OPLS explicit solvent MD. Red, high-value B-factor; blue, low-value B-factor. (C) Diagram of 1lwba showing that, after 100 ps of MD (orange), the decoys moved closer to the native structure (gray) with a PC value of −23.2% ± 25.8% compared with −20.2% ± 27.9% at 200 ps (blue). (D) Diagram of 1h99a1, which shows the least improvement for OPLS explicit solvent MD, colored by B-factor with high values in red and low values in blue. (E) Diagram of 1h99a1 showing how MD moves the structure away from the native structure (gray) as time progresses, with a PC value of 75.3% ± 36.5% at 100 ps (orange) compared with 98.4% ± 40.3% at 200 ps (blue). Note the flexibility of the chain termini.

**Fig. 4.**
Showing directed movement on the potential energy surface for GBSA energy minimization. The green points mark the positions of the initial decoy structures, the red points mark the positions of the final energy minimized structures, and the line connecting them indicates the movement. These contoured projections of the multidimensional energy surface are made by selecting a random subset of 30 decoys before and after minimization, together with the native structure (big pink disk), to construct a 61 × 61 matrix of pairwise wRMS values. The energy contours are filled with color that varies from blue for low energy to red for high energy. In this plot, points close together on paper are generally similar structures with a small rmsd value, whereas points far apart on paper are different structures with a large rmsd value. Good and bad cases are selected using the PC values for each of the proteins (shown in parentheses). The good cases show an attractor basin, seen most clearly for 1kpta. For the bad cases, the arrangement of the hills and valleys separates the native structure from the major energy basin where all of the decoys end up after energy minimization. Multidimensional scaling is done using Graphviz to get the clearest 2D representation of the 61-dimensional space defined by the wRMS matrix. We generate the contour plots using the potential energy at each point and MATLAB's 4-point smoothing.
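To illustrate how a pairwise distance matrix can be projected onto a plane, the sketch below uses classical (Torgerson) multidimensional scaling with NumPy. This is a swapped-in technique chosen for brevity: the caption says the authors used Graphviz, whose layout algorithm differs, so the coordinates here would not reproduce the published plots. The function name and the 3 × 3 example matrix are hypothetical; in the figure the input would be the 61 × 61 wRMS matrix.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points in n_components dimensions from a symmetric distance matrix.

    Classical MDS: double-center the squared distances, then use the
    leading eigenvectors scaled by the square roots of their eigenvalues.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)          # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    # Made-up distances for three structures (a 3-4-5 triangle).
    dist = [[0.0, 3.0, 4.0],
            [3.0, 0.0, 5.0],
            [4.0, 5.0, 0.0]]
    print(classical_mds(dist))
```

For distances that can be realized exactly in the plane, as in this toy case, the embedded points reproduce the input distances; for a real wRMS matrix the 2D picture is only an approximation, which matches the caption's wording that nearby points are "generally" similar.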

**Fig. 5.**
Comparing the directed movement on the potential energy surface for energy minimization with KB and GBSA and MD with OPLS in explicit solvent. The starting decoy structures are green points and the native structure is shown as a cyan disk. For KB and GBSA, the final energy minimized decoys are red points. For OPLS explicit MD, the pink points are the energy minimized structures at 0 ps, the blue points are the decoys at 100 ps and the red points are the decoys at 200 ps. These projections of the multidimensional energy surface are made using a random subset of 30 decoys for each protein. For KB and GBSA energy minimization, a 61 × 61 matrix of pairwise wRMS values is used (30 initial decoys, 30 corresponding energy minimized decoys, and the native structure). For OPLS explicit MD, a 121 × 121 matrix is used (30 initial starting decoys; 30 corresponding decoys at each of 0 ps, 100 ps, and 200 ps; and the native structure). Good cases with KB energy minimization (1lvfa), GBSA energy minimization (1tml___2) and OPLS explicit MD (1lwba) are selected using the PC values for each of the proteins (shown in parentheses). For each of these 3 proteins, we compare directed movement on the energy surface caused by energy minimization with KB and GBSA and MD with OPLS. An attractor basin is seen for KB and GBSA energy minimization for all 3 proteins, but only with GBSA do the final decoys come very close to each other and to the native structure. For KB energy minimization of 1tml___2 and 1lwba, the energy surface contains many local minima near the native state, preventing it from being reached; in both these cases a clear attractive basin is seen with GBSA. For the best case with OPLS explicit MD (1lwba), initial simulation moves the protein closer to the native structure, but more simulation (100 to 200 ps) moves it away again. For the less good cases with OPLS explicit MD (1lvfa and 1tml___2), simulation moves the structure all over the energy surface, with the 200-ps points (red) farther from the native structure than the 100-ps points (blue) or the initial decoy points (green).


