J Chem Theory Comput. 2018 Nov 13;14(11):6102-6116.
doi: 10.1021/acs.jctc.8b00683. Epub 2018 Oct 8.

Template-Guided Protein Structure Prediction and Refinement Using Optimized Folding Landscape Force Fields

Mingchen Chen et al. J Chem Theory Comput.

Abstract

When good structural templates can be identified, template-based modeling is the most reliable way to predict the tertiary structure of proteins. In this study, we combine template-based modeling with a realistic coarse-grained force field, AWSEM, that has been optimized using the principles of energy landscape theory. The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained force field having both transferable tertiary interactions and knowledge-based local-in-sequence interaction terms. We incorporate template information into AWSEM by introducing soft collective biases to the template structures, resulting in a model that we call AWSEM-Template. Structure prediction tests on eight targets, four of which are in the low sequence identity "twilight zone" of homology modeling, show that AWSEM-Template can achieve high-resolution structure prediction. Our results also confirm that combining AWSEM with a template-guided potential leads to more accurate prediction of protein structures than using a template-guided potential alone. Free energy profile analyses demonstrate that the soft collective biases to the template effectively increase funneling toward native-like structures while still allowing significant flexibility to correct discrepancies between the target structure and the template. A further stage of refinement, using all-atom molecular dynamics augmented with soft collective biases to the structures predicted by AWSEM-Template, further improves both backbone and side-chain accuracy: it maintains sufficient flexibility while discouraging the unproductive unfolding events often seen in unrestrained all-atom refinement simulations. The all-atom refinement simulations also reduce patches of frustration in the initial predictions.
Some of the backbones found among the structures produced during the initial coarse-grained prediction step already have CE-RMSD values of less than 3 Å with 90% or more of the residues aligned to the experimentally solved structure for all targets. All-atom structures generated during the subsequent all-atom refinement simulations, which started from coarse-grained structures that were chosen without reference to any knowledge about the native structure, have CE-RMSD values of less than 2.5 Å with 90% or more of the residues aligned for 6 out of 8 targets. Clustering the low-energy structures generated during the initial coarse-grained annealing reliably picks out structures that are within 1 Å of the best sampled structures in 5 out of 8 cases. After the all-atom refinement, structures that are within 1 Å of the best sampled structures can be selected using a simple algorithm based on energetic features alone in 7 out of 8 cases.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1.
Protocol for structure prediction using AWSEM-Template and all-atom refinement.
Figure 2.
Summary of the coarse-grained structure prediction results for eight proteins with different sequence identities to their templates. From left to right, the prediction targets are T0251, 1MBA, T0815, T0833, T0792, T0784, T0803, and T0766. Q, a measure of structural similarity, is plotted on the y-axis. The maximum Q values obtained during the AWSEM annealing simulations without template information (blue diamonds) and the AWSEM-Template annealing simulations (red diamonds) are plotted along with the Q values of the blindly selected AWSEM-Template structures (red squares). Blind selection of structures typically yields structures that are within ≈0.1 Q units of the best sampled structure. For comparison, the maximum Q values obtained from annealing simulations using a Hamiltonian having only the template bias and backbone geometry constraints are drawn in green diamonds.
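The caption above describes Q only as a measure of structural similarity. As a rough illustration, the sketch below computes a Q-like score from two Cα traces. The functional form (a Gaussian in the pairwise distance difference, with a width that grows as sequence separation to the power 0.15, summed over pairs at least three residues apart) is an assumption based on the definition commonly used with AWSEM, not something stated on this page, and the function name `q_value` is hypothetical.

```python
import numpy as np


def q_value(coords, native_coords, min_seq_sep=3):
    """Q-like similarity between two C-alpha traces, each an (N, 3) array in Å.

    ASSUMPTION: Gaussian form with sigma_ij = |i - j|**0.15 over pairs with
    |i - j| >= min_seq_sep. This is a common convention, not taken from the
    abstract or captions.
    """
    coords = np.asarray(coords, dtype=float)
    native_coords = np.asarray(native_coords, dtype=float)
    if coords.shape != native_coords.shape:
        raise ValueError("both structures must have the same number of residues")

    # All residue pairs (i, j) with j - i >= min_seq_sep.
    i, j = np.triu_indices(len(coords), k=min_seq_sep)
    if i.size == 0:
        raise ValueError("chain is too short for the chosen sequence separation")

    r = np.linalg.norm(coords[i] - coords[j], axis=1)
    r_native = np.linalg.norm(native_coords[i] - native_coords[j], axis=1)
    sigma = (j - i) ** 0.15

    # Each pair contributes 1 when its distance matches the native distance
    # and decays toward 0 as the two distances diverge.
    return float(np.mean(np.exp(-((r - r_native) ** 2) / (2.0 * sigma ** 2))))
```

Because the score depends only on internal distances, it needs no prior superposition of the two structures; identical structures give Q = 1 and unrelated ones approach 0.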
Figure 3.
Prediction quality and consistency for each of the eight structure prediction targets during the coarse-grained prediction step. Red diamonds correspond to AWSEM-Template predictions and blue diamonds correspond to AWSEM predictions without template information. Green diamonds correspond to predictions using a model that has only template information to guide tertiary structure prediction and lacks the transferable energy terms in the standard AWSEM model (V_contact, V_burial, and V_HB). In each case, the highest Q values from each of the 20 annealing simulations are plotted in ascending order. The alignments of the best sampled structures (those sampled structures that have the highest Q value) from the AWSEM-Template annealing simulations with the corresponding X-ray crystal structures are shown on the right side of each panel. The predicted structures are shown in red and the X-ray crystal structures are shown in white. Below the structures, several accuracy metrics are given including the Q value, the RMSD value, and the CE-RMSD value along with the percentage of aligned residues in the CE alignment.
Figure 4.
Comparison of initial and refined structures of each protein. Left: Three-dimensional alignment of initial, refined, and native structures. The initial structure (red) and the refined structure with the maximum Q value (blue) are aligned to the native structure (white). Middle: The contact maps comparing the initial and native structures (upper left) and comparing the refined and native structures (lower right). The contact maps are calculated from the Cα atoms with a cutoff distance of 9.5 Å. Hollow black squares are native contacts. Filled red squares are false-positive contacts. Filled green squares are false-negative contacts. Right: Changes in the incorrectly predicted contacts during refinement. Here, hollow symbols indicate a decrease and filled symbols indicate an increase. The upper left corner shows the changes in false-positive contacts, with red hollow squares indicating the loss of a false-positive contact and red filled squares indicating the gain of a false-positive contact. The lower right corner shows the changes in false-negative contacts, with green hollow squares indicating the loss of a false-negative contact and green filled squares indicating the gain of a false-negative contact.
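The contact-map comparison in this caption can be sketched in a few lines. The 9.5 Å Cα cutoff comes from the caption; the strict "less than" comparison, the `min_seq_sep` parameter (the caption does not say whether near neighbors in sequence are excluded), and the function names are assumptions made for illustration.

```python
import numpy as np


def contact_map(ca_coords, cutoff=9.5, min_seq_sep=1):
    """Boolean (N, N) contact map from C-alpha coordinates in Å.

    ASSUMPTION: a pair is in contact when its distance is strictly below the
    cutoff; min_seq_sep is a hypothetical knob (1 excludes only self-pairs).
    """
    ca = np.asarray(ca_coords, dtype=float)
    dist = np.linalg.norm(ca[:, None, :] - ca[None, :, :], axis=-1)
    idx = np.arange(len(ca))
    separated = np.abs(idx[:, None] - idx[None, :]) >= min_seq_sep
    return (dist < cutoff) & separated


def contact_errors(model_ca, native_ca, cutoff=9.5):
    """Count false-positive and false-negative contacts of a model.

    False positive: contact present in the model but absent in the native.
    False negative: contact present in the native but absent in the model.
    Each residue pair is counted once (upper triangle only).
    """
    model = contact_map(model_ca, cutoff)
    native = contact_map(native_ca, cutoff)
    upper = np.triu(np.ones_like(native, dtype=bool), k=1)
    false_pos = model & ~native & upper
    false_neg = ~model & native & upper
    return int(false_pos.sum()), int(false_neg.sum())
```

Calling `contact_errors` once on the initial structure and once on the refined structure, then differencing the two boolean masks, would give the gained and lost contacts shown in the right-hand panels.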
Figure 5.
Change of accuracy of side-chain packing during atomistic refinement. The accuracy was calculated for (A, “bAccuracy”) the residues whose normalized solvent accessible surface areas are less than 0.2 in the X-ray crystal structure and (B, “sAccuracy”) the residues whose normalized solvent accessible surface areas are greater than 0.5 in the X-ray crystal structure. The accuracy was evaluated based on the side-chain dihedrals χ1 (blue) and χ2 (orange) of each residue. The average (solid line) and standard deviation (shadow) were calculated over a total of 10 independent refinement simulations.
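A minimal sketch of how such a side-chain accuracy could be computed for one dihedral (χ1 or χ2) is shown below. The burial thresholds (normalized SASA below 0.2 for buried, above 0.5 for surface) come from the caption. The angular tolerance is not given on this page, so the 40° default is an assumption, as are the function names; the sketch also ignores the symmetry of χ2 in residues such as Phe, Tyr, and Asp.

```python
import numpy as np


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles in degrees, in [0, 180]."""
    diff = np.asarray(a_deg, dtype=float) - np.asarray(b_deg, dtype=float)
    return np.abs((diff + 180.0) % 360.0 - 180.0)


def chi_accuracy(model_chi, native_chi, rel_sasa, tolerance_deg=40.0):
    """Fraction of residues whose dihedral lies within a tolerance of native.

    ASSUMPTION: tolerance_deg = 40 is a hypothetical choice; the source does
    not state the criterion. rel_sasa is the normalized solvent accessible
    surface area of each residue in the reference (crystal) structure.
    """
    rel_sasa = np.asarray(rel_sasa, dtype=float)
    correct = angular_difference(model_chi, native_chi) <= tolerance_deg

    buried = rel_sasa < 0.2   # "bAccuracy" residues
    surface = rel_sasa > 0.5  # "sAccuracy" residues

    def fraction(mask):
        return float(correct[mask].mean()) if mask.any() else float("nan")

    return {"bAccuracy": fraction(buried), "sAccuracy": fraction(surface)}
```

The wrap-around in `angular_difference` matters because dihedrals of -175° and 175° differ by only 10°, not 350°.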
Figure 6.
Number of minimally frustrated and highly frustrated interactions as a function of time during the all-atom refinement simulations. The average number of minimally frustrated interactions during the 10 all-atom refinement simulations is shown as a function of time in green for all prediction targets. The average number of highly frustrated interactions is shown in red. The standard deviations of these quantities are indicated with error bars. For comparison, the numbers of minimally frustrated and highly frustrated interactions present in the corresponding crystal structures are shown as solid horizontal lines.
Figure 7.
Soft constraints from templates guide the folding of T0792 toward a native-like state. Free energy profiles of T0792 as a function of Q are shown using different models: AWSEM without template information (blue circles), AWSEM-Template with soft constraints (red circles), and AWSEM-Template with strong constraints (yellow circles).
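For readers unfamiliar with free energy profiles along an order parameter, the sketch below estimates F(Q) = -k_B T ln P(Q) from a histogram of Q values. It assumes the samples are already unbiased (or have been reweighted); this page does not say how the profiles in the figure were obtained, and the temperature, bin count, and function name are hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def free_energy_profile(q_samples, temperature=300.0, bins=20):
    """Estimate F(Q) = -k_B T ln P(Q) from equilibrium samples of Q.

    ASSUMPTION: q_samples are drawn from the unbiased ensemble; the default
    temperature and bin count are illustrative only.
    Returns bin centers and F in kcal/mol, shifted so the minimum is zero.
    Bins that were never visited get F = +inf.
    """
    counts, edges = np.histogram(q_samples, bins=bins, range=(0.0, 1.0))
    if counts.sum() == 0:
        raise ValueError("no samples fall inside the range [0, 1]")

    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = counts / counts.sum()

    # log(0) for empty bins is expected; it yields +inf free energy.
    with np.errstate(divide="ignore"):
        free_energy = -K_B * temperature * np.log(prob)

    free_energy -= free_energy[np.isfinite(free_energy)].min()
    return centers, free_energy
```

In such a profile, stronger funneling toward native-like structures would appear as a lower free energy at high Q relative to low Q.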
Figure 8.
Expectation values of energy terms in the AWSEM-Template model as a function of Q for the protein T0792. The expectation values are shown for several different models: AWSEM without template information (blue circles), AWSEM-Template with soft constraints (red circles), and AWSEM-Template with strong constraints (yellow circles).
