BMC Struct Biol. 2007 Mar 19;7:12.
doi: 10.1186/1472-6807-7-12.

Protein structure prediction by all-atom free-energy refinement

Abhinav Verma et al. BMC Struct Biol.

Abstract

Background: The reliable prediction of protein tertiary structure from the amino acid sequence remains challenging even for small proteins. We have developed an all-atom free-energy protein forcefield (PFF01) that we could use to fold several small proteins from completely extended conformations. Because the computational cost of de-novo folding studies rises steeply with system size, this approach is unsuitable for structure prediction purposes. We therefore investigate here a low-cost free-energy relaxation protocol for protein structure prediction that combines heuristic methods for model generation with all-atom free-energy relaxation in PFF01.

Results: We use PFF01 to rank and cluster the conformations for 32 proteins generated by ROSETTA. For the 22 high-quality and 10 low-quality decoy sets we select near-native conformations with an average Cα root mean square deviation of 3.03 Å and 6.04 Å, respectively. The protocol incorporates an inherent reliability indicator that succeeds for 78% of the decoy sets. In over 90% of these cases near-native conformations are selected from the decoy set. This success rate is rationalized by the quality of the decoys and the selectivity of the PFF01 forcefield, which ranks near-native conformations on average 3.06 standard deviations below the mean energy of the relaxed decoys (Z-score).

Conclusion: All-atom free-energy relaxation with PFF01 emerges as a powerful low-cost approach toward generic de-novo protein structure prediction. The approach can be applied to large all-atom decoy sets of any origin and requires no preexisting structural information to identify the native conformation. The study provides evidence that a large class of proteins may be foldable by PFF01.

Figures

Figure 1
Energy relaxation. Each conformation in a decoy set (1r69) is relaxed from its starting conformation (top set) to its final conformation (bottom set); dashed lines connect the starting and final conformations for a subset of the relaxation runs. The native conformation (red triangle) has the lowest energy, and clustering of the fifty lowest-energy decoys (blue triangles) predicts the native conformation. Inset: Energy relaxation leads to an enrichment of the near-native decoys in the entire set of final conformations (blue set = 50 lowest-energy decoys).
Figure 2
Z-scores. Z-scores for independently generated near-native conformations of the proteins investigated. The Z-score is computed as the energy difference between the near-native decoy and the mean of the decoy set, divided by the standard deviation of the latter. The relaxed energies were used for the mean and standard deviation.
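The Z-score definition in this caption can be illustrated with a minimal sketch. The function name, the example energies, and the use of the population standard deviation are assumptions made for illustration; the source does not say which estimator was used. The sign convention assumed here makes a conformation with lower energy than the decoy mean yield a negative value.

```python
from statistics import mean, pstdev


def z_score(near_native_energy, relaxed_decoy_energies):
    """Energy gap to the decoy mean, in units of the decoy standard deviation.

    Assumption: population standard deviation (pstdev); the source does not
    specify the estimator.
    """
    mu = mean(relaxed_decoy_energies)
    sigma = pstdev(relaxed_decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies have zero spread")
    return (near_native_energy - mu) / sigma


# Hypothetical relaxed decoy energies (arbitrary units), not data from the paper.
relaxed = [-20.0, -22.0, -24.0, -26.0, -28.0]
z = z_score(-32.0, relaxed)
print(f"Z-score: {z:.2f}")
```

A more negative value means the near-native conformation lies further below the bulk of the relaxed decoys in energy.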
Figure 3
Cα RMSD of predicted structures. The Cα RMS deviation between the best cluster and the experimental conformation for the high- and low-quality decoy sets (ordered by protein size) shows no decrease of the prediction probability with protein size for the high-quality decoy sets. Green bars indicate correct predictions, red bars prediction failures. Shaded bars indicate indecisive simulations (see text), where there was no 'largest cluster' in the top 50 decoys. We note that there is only one genuine prediction failure for the high-quality decoy sets. The horizontal dashed line indicates the acceptance threshold for correct predictions. The purple bars designate the deviations of the best decoys when the same clustering technique is applied with the original ROSETTA scoring function.
Figure 4
Top row: Overlay of some predicted structures. The overlay of the predicted (red) and experimental (green) conformations documents their close agreement for 1orc [66] and 1ail [67]; only the backbone of the proteins is shown for clarity (generated with PYMOL [62]). Bottom row: Cβ-Cβ distance difference matrices. A pixel in row i and column j of the color-coded distance map indicates the difference in the Cβ-Cβ distances of the native and the folded structure. Black (gray) squares indicate that the Cβ-Cβ distances of the native and the other structure differ by less than 1.5 (2.25) Å, respectively. White squares indicate larger deviations. The data clearly indicate the presence of all short-range and long-range native contacts for the high-quality decoys and good agreement even for the predictions from low-quality decoy sets.
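The distance difference matrix described in this caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function names and the toy coordinates are hypothetical, and only the 1.5 Å and 2.25 Å thresholds and the black/gray/white coding come from the caption.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances between 3D points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def difference_map(native_cb, model_cb, tight=1.5, loose=2.25):
    """Label each residue pair by how much its Cb-Cb distance differs.

    'black': difference below `tight`; 'gray': below `loose`; 'white': larger.
    Thresholds are in angstroms, as given in the figure caption.
    """
    if len(native_cb) != len(model_cb):
        raise ValueError("structures must have the same number of Cb atoms")
    d_nat = distance_matrix(native_cb)
    d_mod = distance_matrix(model_cb)
    labels = []
    for row_nat, row_mod in zip(d_nat, d_mod):
        row = []
        for a, b in zip(row_nat, row_mod):
            diff = abs(a - b)
            row.append("black" if diff < tight else "gray" if diff < loose else "white")
        labels.append(row)
    return labels


# Toy coordinates (angstroms), invented for illustration only.
native = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)]
model = [(0, 0, 0), (5, 0, 0), (12, 0, 0), (20, 0, 0)]
dmap = difference_map(native, model)
for row in dmap:
    print(" ".join(f"{label:5s}" for label in row))
```

Because the map compares internal distances only, it does not require superimposing the two structures first.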
Figure 5
Enrichment. Enrichment of native conformations by PFF01: Fraction of near-native conformations in the top 50 decoys by energy (black) and by Cα RMSD (red). The latter bar indicates the enrichment attainable by an 'ideal' scoring function.
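The two fractions compared in this figure can be sketched as below. The dictionary layout, the function name, and in particular the 4.0 Å near-native cutoff are assumptions made for illustration; this excerpt does not state the threshold the authors used.

```python
def near_native_fraction(decoys, rank_by, top_n=50, cutoff=4.0):
    """Fraction of near-native decoys among the top_n ranked by `rank_by`.

    `decoys` is a list of dicts with 'energy' and 'rmsd' keys (hypothetical
    layout). `cutoff` is an assumed near-native threshold in angstroms.
    """
    top = sorted(decoys, key=lambda d: d[rank_by])[:top_n]
    if not top:
        raise ValueError("decoy set is empty")
    hits = sum(1 for d in top if d["rmsd"] <= cutoff)
    return hits / len(top)


# Invented decoys for illustration; not data from the paper.
decoys = [
    {"energy": -30.0, "rmsd": 2.0},
    {"energy": -28.0, "rmsd": 6.0},
    {"energy": -25.0, "rmsd": 3.0},
    {"energy": -10.0, "rmsd": 1.5},
]
by_energy = near_native_fraction(decoys, "energy", top_n=2)
by_rmsd = near_native_fraction(decoys, "rmsd", top_n=2)
print(by_energy, by_rmsd)
```

Ranking by RMSD gives the upper bound that an 'ideal' scoring function could reach, so the gap between the two numbers shows how much selectivity the energy ranking leaves on the table.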
Figure 6
Enrichment. Enrichment of the best decoys by Cα RMSD, total energy in PFF01 and its components (Lennard-Jones, Solvation Energy, Hydrogen Bonding, Sidechain Electrostatics), Radius of Gyration, and the clustering technique described in the text. The vertical axis counts the number of proteins in the database that yield a best decoy according to the chosen criterion.
Figure 7
Comparison with DFIRE: RMSD of the best-energy decoy (top row) and of the best cluster (bottom row) for the decoy sets, as selected by PFF01 (green bars) and by DFIRE (red bars). The dashed lines indicate the averages over all decoy sets.

References

    1. Baker D, Sali A. Protein Structure Prediction and Structural Genomics. Science. 2001;294:93–96. doi: 10.1126/science.1065659.
    2. Hardin C, Pogorelov T, Luthey-Schulten Z. Ab initio protein structure prediction. Curr Opin Struct Biol. 2002;12:176–181. doi: 10.1016/S0959-440X(02)00306-8.
    3. Moult J, Fidelis K, Rost B, Hubbard T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP) – Round 6. Proteins: Structure, Function, and Bioinformatics. 2005;61:3–7. doi: 10.1002/prot.20716.
    4. Ginalski K, Grishin N, Godzik A, Rychlewski L. Practical lessons from protein structure prediction. Nucl Acids Res. 2005;33:1874–1891. doi: 10.1093/nar/gki327.
    5. Bradley P, Misura KMS, Baker D. Toward High-Resolution de-novo Structure Prediction for small proteins. Science. 2005;309:1868–1871. doi: 10.1126/science.1113801.
