Review
Annu Rev Biophys. 2008;37:289-316.
doi: 10.1146/annurev.biophys.37.092707.153558.

The protein folding problem

Ken A Dill et al. Annu Rev Biophys. 2008.

Abstract

The "protein folding problem" consists of three closely related puzzles: (a) What is the folding code? (b) What is the folding mechanism? (c) Can we predict the native structure of a protein from its amino acid sequence? Once regarded as a grand challenge, protein folding has seen great progress in recent years. Now, foldable proteins and nonbiological polymers are being designed routinely and moving toward successful applications. The structures of small proteins are now often well predicted by computer methods. And, there is now a testable explanation for how a protein can fold so quickly: A protein solves its large global optimization problem as a series of smaller local optimization problems, growing and assembling the native structure from peptide fragments, local structures first.

Figures

Figure 1
(a) Binary code. Experiments show that a primarily binary hydrophobic-polar code is sufficient to fold helix-bundle proteins (112). Reprinted from Reference with permission from AAAS. (b) Lattice models show that compactness stabilizes secondary structure in proteins. (c) Experiments supporting panel b, showing that compactness correlates with secondary structure content in nonnative states of many different proteins (218). Reprinted from Reference with permission.
Figure 2
(a) A novel protein fold, called Top7, designed by Kuhlman et al. (129). Designed molecule (blue) and the experimental structure determined subsequently (red). From Reference ; reprinted with permission from AAAS. (b) Three-helix bundle foldamers have been made using nonbiological backbones (peptoids, i.e., N-substituted glycines). (c) Their denaturation by alcohols indicates they have hydrophobic cores characteristic of a folded molecule (134).
Figure 3
Progress in protein structure prediction in CASP1–5 (219). The y-axis shows the GDT_TS score, the percentage of model residues that can be superimposed on the true native structure, averaged over four resolutions from 1 to 8 Å (100% is perfect). The x-axis is the ranked target difficulty, measured by sequence and structural similarities to proteins in the PDB at the time of the respective CASP. This shows that protein structure prediction on easy targets is quite good and is improving for targets of intermediate difficulty. Reprinted from Reference with permission.
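The score described in this caption can be illustrated with a minimal sketch. It assumes the model has already been superimposed on the native structure and that the per-residue deviations are known; the cutoffs 1, 2, 4, and 8 Å are the usual GDT_TS choice for "four resolutions from 1 to 8 Å". The function name and the example deviations are hypothetical. Real GDT calculations also search for the best superposition at each cutoff, which this sketch omits.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score (0-100) from per-residue deviations in angstroms.

    Assumption: `distances` come from one fixed superposition of model on
    native; the full GDT procedure optimizes the superposition per cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    # Average the four fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical deviations for a six-residue model (not data from the paper).
example_distances = [0.5, 0.9, 1.5, 3.0, 6.0, 10.0]
example_score = gdt_ts(example_distances)
print(round(example_score, 2))
```

A model whose residues all fall within 1 Å of the native positions scores 100 under this definition.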
Figure 4
Folding rate versus relative contact order (a measure of localness of contacts in the native structure) for the 48 two-state proteins given in Reference , showing that proteins with the most local contacts fold faster than proteins with more nonlocal contacts.
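Relative contact order is commonly defined as the average sequence separation between contacting residues, divided by the chain length. The sketch below follows that definition. The function name, the contact list, and the chain length are hypothetical illustrations, and deciding which residue pairs count as "in contact" (typically a distance cutoff in the native structure) is left outside the sketch.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order from a list of (i, j) residue-index pairs.

    RCO = sum(|i - j|) / (chain_length * number_of_contacts).
    Assumption: `contacts` was produced elsewhere, e.g. by a distance
    cutoff applied to the native structure.
    """
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not contacts:
        raise ValueError("need at least one contact")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))


# Hypothetical 10-residue chains: one with mostly local contacts,
# one with mostly nonlocal contacts.
local_contacts = [(1, 4), (2, 5), (6, 9)]
nonlocal_contacts = [(1, 10), (2, 9), (3, 8)]
rco_local = relative_contact_order(local_contacts, 10)
rco_nonlocal = relative_contact_order(nonlocal_contacts, 10)
print(rco_local, rco_nonlocal)
```

A lower value means the native contacts are more local in sequence, which is the regime the caption associates with faster folding.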
Figure 5
One type of energy landscape cartoon: the free energy F(φ, ϕ, κ) of the bond degrees of freedom. These pictures give a sort of simplified schematic diagram, useful for illustrating a protein’s partition function and density of states. (a) A smooth energy landscape for a fast folder, (b) a rugged energy landscape with kinetic traps, (c) a golf course energy landscape in which folding is dominated by diffusional conformational search, and (d) a moat landscape, where folding must pass through an obligatory intermediate.
Figure 6
The experimentally determined energy landscape of the seven ankyrin repeats of the Notch receptor (16, 157, 209). The energy landscape is constructed by measuring the stabilities of folded fragments for a series of overlapping modular repeats. Each horizontal tier presents the partially folded fragments with the same number of repeats. Reprinted from Reference with permission.
Figure 7
(Left) The density of states (DOS) cartoonized as an energy landscape for the three-helix bundle protein F13W*: DOS (x-axis) versus the energy (y-axis). (Right) Denaturation predictions versus experiments (146). The peak free energy (here, where the DOS is minimum), typically taken to be the transition state, is energetically very close to native.
Figure 8
(a) A simple single-pathway system. R, reactant; TS, transition state; P, product. (b) Pathway diagram of SH3 folding from a master-equation model. The native protein has five contact clusters: A = RT loop; B = β2β3; C = β3β4; D = RT loop-β4; and E = β1β5. Combined letters, such as BD, mean that multiple contact clusters have formed. Funneling occurs toward the right, because the symbols on the left indicate large ensembles, whereas the symbols on the right are smaller ensembles. The numbers indicate free energies relative to the denatured state. The arrows between the states are colored to indicate transition times between states. The slowest steps are in red; the fastest steps in green. BD is the transition state ensemble because it is the highest free energy along the dominant route. While B and BD would seem to be obligatorily in series, the time evolutions of these states show that they actually rise and fall in parallel (232).
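A master-equation model of the kind mentioned in panel b treats folding as probability flowing between discrete states at fixed rates. The sketch below is a toy three-state chain (D, I, N) with made-up rate constants, integrated with a simple forward-Euler step. It is not the SH3 model from the caption, and all names and numbers are assumptions chosen for illustration.

```python
def integrate_master_equation(rates, p0, dt, steps):
    """Forward-Euler integration of dp_i/dt = sum_j (k_ji p_j - k_ij p_i).

    rates: dict mapping (i, j) to the rate constant for the step i -> j.
    p0:    initial state probabilities (should sum to 1).
    """
    p = list(p0)
    for _ in range(steps):
        dp = [0.0] * len(p)
        for (i, j), k in rates.items():
            flux = k * p[i]   # probability per unit time leaving i for j
            dp[i] -= flux
            dp[j] += flux
        p = [pi + dt * dpi for pi, dpi in zip(p, dp)]
    return p


# Hypothetical states: 0 = denatured (D), 1 = intermediate (I), 2 = native (N).
toy_rates = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 0.1}
final_p = integrate_master_equation(toy_rates, [1.0, 0.0, 0.0], dt=0.01, steps=20000)
print([round(x, 4) for x in final_p])
```

With these rates, detailed balance gives equilibrium populations in the ratio 1:1:10, so the native state ends up holding most of the probability. Recording the populations at every step, rather than only the final values, would show how individual states rise and fall over time.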
Figure 9
The folding routes found in the ZAM conformational search process for protein G, from the work described in Reference . The chain is first parsed into many short, overlapping fragments. After sampling by replica exchange molecular dynamics, stable hydrophobic contacts are identified and restrained. Fragments are then either (a) grown or zipped through iterations of adding new residues, sampling, and contact detection, or (b) assembled together pairwise using rigid body alignment followed by further sampling until a completed structure is reached.
Figure 10
Ribbon diagrams of the predicted protein structures using the ZAM search algorithm (purple) versus experimental PDB structures (orange). The backbone Cα-RMSDs with respect to PDB structures are protein A (1.9 Å), albumin domain binding protein (2.4 Å), alpha-3D [2.85 Å (excluding loop residues) or 4.6 Å], FBP26 WW domain (2.2 Å), YJQ8 WW domain (2.0 Å), 1–35 residue fragment of Ubiquitin (2.0 Å), protein G (1.6 Å), and α-spectrin SH3 (2.2 Å). ZAM fails to find the src-SH3 structure: Shown is a conformation that is 6 Å from the experimental structure. The problem in this case appears to be overstabilization of nonnative ion pairs in the GB/SA implicit solvation model.

Similar articles

  • Folding of proteins with diverse folds.
    Mohanty S, Hansmann UH. Biophys J. 2006 Nov 15;91(10):3573-8. doi: 10.1529/biophysj.106.087668. Epub 2006 Sep 1. PMID: 16950845. Free PMC article.
  • The protein-folding problem, 50 years on.
    Dill KA, MacCallum JL. Science. 2012 Nov 23;338(6110):1042-6. doi: 10.1126/science.1219021. PMID: 23180855. Review.
  • Protein folding is a convergent problem!
    Das Gupta D, Kaushik R, Jayaram B. Biochem Biophys Res Commun. 2016 Nov 25;480(4):741-744. doi: 10.1016/j.bbrc.2016.10.119. Epub 2016 Oct 28. PMID: 27983988.
  • Mechanisms of protein folding.
    Ivarsson Y, Travaglini-Allocatelli C, Brunori M, Gianni S. Eur Biophys J. 2008 Jul;37(6):721-8. doi: 10.1007/s00249-007-0256-x. Epub 2008 Jan 9. PMID: 18185928. Review.
  • Predicting protein folding pathways.
    Zaki MJ, Nadimpally V, Bardhan D, Bystroff C. Bioinformatics. 2004 Aug 4;20 Suppl 1:i386-93. doi: 10.1093/bioinformatics/bth935. PMID: 15262824.

References

    1. So much more to know... Science. 2005;309:78–102.
    2. Allen F, Coteus P, Crumley P, Curioni A, Denneau M, et al. Blue gene: a vision for protein science using a petaflop supercomputer. IBM Syst J. 2001;40:310–27.
    3. Anfinsen CB. Principles that govern the folding of protein chains. Science. 1973;181:223–30.
    4. Anfinsen CB, Scheraga HA. Experimental and theoretical aspects of protein folding. Adv Protein Chem. 1975;29:205–300.
    5. Auton M, Holthauzen LM, Bolen DW. Anatomy of energetic changes accompanying urea-induced protein denaturation. Proc Natl Acad Sci USA. 2007;104:15317–22.
