Proc Natl Acad Sci U S A. 2009 Mar 10;106(10):3734-9.
doi: 10.1073/pnas.0811363106. Epub 2009 Feb 23.

Mimicking the folding pathway to improve homology-free protein structure prediction

Joe DeBartolo et al. Proc Natl Acad Sci U S A.

Abstract

Since the demonstration that the sequence of a protein encodes its structure, the prediction of structure from sequence remains an outstanding problem that impacts numerous scientific disciplines, including many genome projects. By iteratively fixing secondary structure assignments of residues during Monte Carlo simulations of folding, our coarse-grained model without information concerning homology or explicit side chains can outperform current homology-based secondary structure prediction methods for many proteins. The computationally rapid algorithm using only single (phi,psi) dihedral angle moves also generates tertiary structures of accuracy comparable with existing all-atom methods for many small proteins, particularly those with low homology. Hence, given appropriate search strategies and scoring functions, reduced representations can be used for accurately predicting secondary structure and providing 3D structures, thereby increasing the size of proteins approachable by homology-free methods and the accuracy of template methods that depend on a high-quality input secondary structure.
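The abstract describes Monte Carlo folding simulations built from single (phi,psi) dihedral angle moves. A minimal sketch of that kind of move under a Metropolis acceptance rule is given below. Everything specific in it is an assumption made for illustration: the scoring function toy_score is a placeholder that merely favors helix-like angles, trial angles are drawn uniformly rather than from any structural library, and the names metropolis_step and run_toy_simulation are hypothetical. It is not the authors' energy function or sampling scheme.

```python
import math
import random


def toy_score(angles):
    """Placeholder score (assumption): penalize distance from a helix-like (-60, -45) pair."""
    def wrap(delta):
        # Map an angular difference into [-180, 180) degrees.
        return (delta + 180.0) % 360.0 - 180.0

    total = 0.0
    for phi, psi in angles:
        total += wrap(phi + 60.0) ** 2 + wrap(psi + 45.0) ** 2
    return total / 1000.0


def metropolis_step(angles, score, temperature, rng):
    """Propose a new (phi, psi) pair for one random residue and accept or reject it."""
    index = rng.randrange(len(angles))
    trial = list(angles)
    trial[index] = (rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
    delta = score(trial) - score(angles)
    # Metropolis criterion: always accept downhill moves, sometimes accept uphill ones.
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return trial, True
    return angles, False


def run_toy_simulation(n_residues=10, n_steps=5000, temperature=1.0, seed=0):
    """Run the toy chain and report the starting score, best score seen, and final angles."""
    rng = random.Random(seed)
    angles = [(rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
              for _ in range(n_residues)]
    start = toy_score(angles)
    best = start
    for _ in range(n_steps):
        angles, _accepted = metropolis_step(angles, toy_score, temperature, rng)
        best = min(best, toy_score(angles))
    return start, best, angles


if __name__ == "__main__":
    start, best, _ = run_toy_simulation()
    print(f"starting score {start:.2f}, best score seen {best:.2f}")
```

Because each trial changes only one residue's dihedral pair, a move is cheap to propose, which is consistent with the abstract's description of the algorithm as computationally rapid.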

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Interrelated themes of protein folding. Protein backbone motions, 2° structure and hydrogen bonding, and side-chain packing are the necessary components of any folding model. Secondary and 3° structure formation are coupled processes whereby the formation of each type of structure influences the formation of the other type.
Fig. 2.
ItFix 2° and 3° structure prediction protocol. At the end of each round, the 2° structure frequencies are used to eliminate H, E, or C when a frequency falls below a specified threshold.
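The round-end elimination step in this caption can be sketched as follows. This is an illustration under stated assumptions, not the published protocol: the threshold value of 0.3, the function name eliminate_rare_types, the rule that a position never loses its last remaining option, and the use of plain per-residue strings as input are all hypothetical choices.

```python
from collections import Counter

TYPES = ("H", "E", "C")  # helix, strand, coil


def eliminate_rare_types(allowed, assignments, threshold=0.3):
    """Drop a 2° structure type at a position when its frequency is below threshold.

    allowed: list of sets, the types still permitted at each residue.
    assignments: list of equal-length strings, one per trajectory, giving the
        2° structure type observed at each residue at the end of the round.
    """
    n_trajectories = len(assignments)
    updated = []
    for position, options in enumerate(allowed):
        counts = Counter(assignment[position] for assignment in assignments)
        kept = {t for t in options if counts[t] / n_trajectories >= threshold}
        if not kept:
            # Assumption: keep the most frequent option rather than leave none.
            kept = {max(sorted(options), key=lambda t: counts[t])}
        updated.append(kept)
    return updated


if __name__ == "__main__":
    allowed = [set(TYPES) for _ in range(4)]
    round_results = ["HHEC", "HHEC", "HECC", "HHCC"]
    for position, options in enumerate(eliminate_rare_types(allowed, round_results)):
        status = "fixed" if len(options) == 1 else "open"
        print(position, sorted(options), status)
```

In this toy input, positions 0, 1, and 3 end the round with a single permitted type, while position 2 stays open between E and C and would be sampled again in the next round.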
Fig. 3.
Secondary structure prediction. ItFix predicts 2° structure at the Q8 level (H, E, CG, CN, CI, CS, CB, or CT). Results are displayed for the targets in Table 2, which have a large number of sequence homologs, making them ideal targets for the PSIPRED and SSPro homology-based prediction methods.
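Per-residue agreement at the Q8 level is conventionally scored as the fraction of residues whose predicted state matches the native state. A short sketch of that calculation follows, using the eight labels listed in the caption. The collapse to three states (treating every label that begins with C as coil) and the function names to_q3 and q_accuracy are assumptions made for this example and are not taken from the paper.

```python
Q8_STATES = ("H", "E", "CG", "CN", "CI", "CS", "CB", "CT")


def to_q3(label):
    """Collapse a Q8 label to H, E, or C (assumption: all C-prefixed labels are coil)."""
    if label not in Q8_STATES:
        raise ValueError(f"unknown Q8 label: {label!r}")
    return "C" if label.startswith("C") else label


def q_accuracy(predicted, native, reduce_to_q3=False):
    """Return the fraction of residues whose predicted state equals the native state."""
    if len(predicted) != len(native):
        raise ValueError("predicted and native must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    if reduce_to_q3:
        predicted = [to_q3(label) for label in predicted]
        native = [to_q3(label) for label in native]
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


if __name__ == "__main__":
    native = ["CN", "H", "H", "H", "CT", "E", "E", "CG"]
    predicted = ["CN", "H", "H", "CS", "CB", "E", "E", "CG"]
    print("Q8:", q_accuracy(predicted, native))                      # 0.75
    print("Q3:", q_accuracy(predicted, native, reduce_to_q3=True))   # 0.875
```

The made-up example shows why the two scores differ: predicting CB where the native state is CT counts as an error at the Q8 level but as a match once both labels are collapsed to coil.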
Fig. 4.
Tertiary structure determination. Alignments of the ItFix lowest observed rmsd 3° structure (dark) with the native structure (light) were made by using PyMol visualization software. Cα rmsd between ItFix model and native structure from column 8 of Table 2 is listed above each target.
Fig. 5.
ItFix algorithm mimics the experimentally determined ubiquitin folding pathway. The position dependence of the 2° structure frequencies at the end of each round, E (blue), H (red), and C (green), is shown. A single color bar represents a residue assigned to a single 2° structure type (native 2° structure shown at the top, along with long-range contacts). As the rounds progress, uncertainties in 2° structure diminish. The major steps in the proposed folding pathway (30, 31) are similar to the order of structure fixing over the multiple rounds: the hairpin forms, followed by the helix and β3 strand, and then β4. The final two events are the folding of the 3–10 helix and β5. Their formation appears in some trajectories but not at a high enough frequency to be fixed for the next round.

References

    1. Yooseph S, et al. The Sorcerer II Global Ocean Sampling expedition: Expanding the universe of protein families. PLoS Biol. 2007;5:e16.
    2. Service RF. Problem solved* (*sort of). Science. 2008;321:784–786.
    3. Altschul SF, et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
    4. Pollastri G, Przybylski D, Rost B, Baldi P. Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins. 2002;47:228–235.
    5. Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol. 1999;292:195–202.