Proc Natl Acad Sci U S A. 2003 Sep 16;100(19):10712-7.
doi: 10.1073/pnas.1931882100. Epub 2003 Sep 8.

Coarse-grained sequences for protein folding and design

Scott Brown et al. Proc Natl Acad Sci U S A.

Abstract

We present the results of sequence design on our off-lattice minimalist model in which no specification of native-state tertiary contacts is needed. We start with a sequence that adopts a target topology and build on it through sequence mutation to produce new sequences that comprise distinct members within a target fold class. In this work, we use the α/β ubiquitin fold class and design two new sequences that, when characterized through folding simulations, reproduce the differences in folding mechanism seen experimentally for proteins L and G. The primary implication of this work is that patterning of hydrophobic and hydrophilic residues is the physical origin for the success of relative contact-order descriptions of folding, and that these physics-based potentials provide a predictive connection between free energy landscapes and amino acid sequence (the original protein folding problem). We present results of the sequence mapping from a 20-letter to a three-letter code for determining a sequence that folds into the WW domain topology to illustrate future extensions to protein design.
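As a rough illustration of what a mapping from the 20-letter amino acid alphabet to a three-letter code can look like, the sketch below assigns each residue to a hydrophobic, hydrophilic, or neutral class. The class labels (`B`, `L`, `N`), the residue assignments, and the names `THREE_LETTER_MAP` and `coarse_grain` are all assumptions made for this example; the abstract does not state the mapping the authors used.

```python
# Hypothetical residue classes (an assumption for illustration only;
# the abstract does not give the authors' actual assignment).
HYDROPHOBIC = "ACFILMVWY"   # mapped to "B"
HYDROPHILIC = "DEHKNQRST"   # mapped to "L"
NEUTRAL = "GP"              # mapped to "N"

# Build a lookup table from one-letter amino acid codes to the three-letter code.
THREE_LETTER_MAP = {}
for residues, label in ((HYDROPHOBIC, "B"), (HYDROPHILIC, "L"), (NEUTRAL, "N")):
    for residue in residues:
        THREE_LETTER_MAP[residue] = label


def coarse_grain(sequence):
    """Translate a one-letter amino acid sequence into the three-letter code."""
    try:
        return "".join(THREE_LETTER_MAP[r] for r in sequence.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue code: {err.args[0]}") from None


if __name__ == "__main__":
    print(coarse_grain("GAK"))
```

Any such table discards most side-chain detail by design; only the hydrophobic/hydrophilic patterning that the abstract identifies as decisive survives the translation.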


Figures

Fig. 1.
Comparison of thermodynamic data for the new optimized sequences of proteins L and G with those of the original L/G sequence. (a) Heat capacity Cv vs. temperature T for the original nonoptimal L/G sequence and the new L and G sequences. The new L and G sequences show increased cooperativity. (b) Fraction of chains in the native state, PNat, vs. temperature, T, for the L/G, L, and G sequences. At the folding temperature, Tf, the population is divided equally between the folded and unfolded states, that is, PNat(Tf) = 0.5.
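The definition PNat(Tf) = 0.5 suggests a simple way to read Tf off a sampled PNat curve: find the two temperatures that bracket 0.5 and interpolate linearly between them. The sketch below does this; the function name `folding_temperature` and the sample values are made up for illustration and are not data from the paper.

```python
def folding_temperature(temps, p_nat, target=0.5):
    """Estimate the temperature at which p_nat crosses `target`.

    temps and p_nat are equal-length lists ordered by temperature.
    Linear interpolation is used between the two bracketing points.
    """
    points = list(zip(temps, p_nat))
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        crosses = (p0 - target) * (p1 - target) <= 0
        if crosses and p0 != p1:
            return t0 + (target - p0) * (t1 - t0) / (p1 - p0)
    raise ValueError("p_nat never crosses the target value")


if __name__ == "__main__":
    # Illustrative values only (reduced temperature units are an assumption).
    temps = [0.30, 0.40, 0.50]
    p_nat = [0.9, 0.6, 0.2]
    print(folding_temperature(temps, p_nat))
```

Linear interpolation is only as good as the temperature grid; near a sharp, cooperative transition a finer grid around the crossing gives a more reliable estimate.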
Fig. 2.
Thermodynamic measures of the formation of native-state secondary structure. (a) Average native-state similarity of the α helix 〈χα〉 vs. temperature T. Stability of the α helix is reduced for the new L and G sequences compared with the original L/G sequence. (b) Average native-state similarity of the first (N-terminal) β-sheet region 〈χβ1〉 vs. temperature T. Stability of the β1 region is reduced for the new L and G sequences relative to the original L/G sequence. Note that the stability is reduced further for G than for L. (c) Average native-state similarity of the second (C-terminal) β-sheet region 〈χβ2〉 vs. temperature T. Stability of the β2 region is similar for the L/G and G sequences but reduced for the L sequence.
Fig. 3.
Projection of free-energy surfaces onto order parameters χβ1 and χβ2. (a) Free-energy contour plot for protein L as a function of native-state similarity of the second (C-terminal) β-sheet region χβ2 and first (N-terminal) β-sheet region χβ1 at the folding temperature. Note that the minimum free-energy path connecting the unfolded and folded ensembles proceeds through a transition state in which the β1 region is native and the β2 region is largely disrupted. (b) Free-energy contour plot for protein G as a function of native-state similarity of the second (C-terminal) β-sheet region χβ2 and first (N-terminal) β-sheet region χβ1 at the folding temperature. For G, the minimum free-energy path connecting the unfolded and folded ensembles proceeds through a transition state in which the β2 region is native-like and the β1 region is disrupted. Contour lines are spaced kBT apart.
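A projected free-energy surface of this kind is commonly estimated from a two-dimensional histogram of the order parameters through F = -kBT ln P. The sketch below applies that relation to equally weighted samples of (χβ1, χβ2) and reports F in units of kBT relative to the most populated bin. It is a minimal illustration under those assumptions; the function name `free_energy_surface`, the binning, and the sample data are hypothetical, and the caption does not say how the authors computed their surfaces.

```python
import math
from collections import Counter


def free_energy_surface(samples, n_bins=10):
    """Return {(i, j): F / kBT} for each populated bin.

    samples: non-empty iterable of (chi_b1, chi_b2) pairs, each in [0, 1],
    assumed to be equally weighted and drawn at a single temperature.
    F is measured relative to the most populated bin (which gets F = 0).
    """
    counts = Counter()
    for chi_b1, chi_b2 in samples:
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        i = min(int(chi_b1 * n_bins), n_bins - 1)
        j = min(int(chi_b2 * n_bins), n_bins - 1)
        counts[(i, j)] += 1
    max_count = max(counts.values())
    return {bin_: -math.log(c / max_count) for bin_, c in counts.items()}


if __name__ == "__main__":
    # Made-up samples: two well-populated basins and a sparse intermediate.
    samples = [(0.05, 0.05)] * 100 + [(0.95, 0.95)] * 100 + [(0.95, 0.05)] * 10
    for bin_, f in sorted(free_energy_surface(samples).items()):
        print(bin_, round(f, 3))
```

With F expressed in units of kBT, contour levels at integer values correspond directly to lines spaced kBT apart.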
Fig. 4.
Kinetic data with fits for proteins L and G. (a) Fraction of unfolded states Punfold as a function of time t for protein L at the folding temperature. The best fit of the data is to a single exponential. (b) Fraction of unfolded states Punfold as a function of time t for protein G at the folding temperature. The best fit for these data is to a double exponential. Fit parameters are given in Table 4.
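One way to see why a decay calls for a double rather than a single exponential is to fit ln Punfold = -kt and inspect the residual: single-exponential data lie on a straight line in the log plot, whereas two-phase data do not. The sketch below shows this with synthetic data. The functional forms (with Punfold = 1 at t = 0), the function names, and the rate constants are assumptions for illustration; they are not the fit parameters of Table 4, and the caption does not describe the authors' fitting procedure.

```python
import math


def double_exponential(t, a, k1, k2):
    """Two-phase decay with amplitudes a and 1 - a (assumed form)."""
    return a * math.exp(-k1 * t) + (1 - a) * math.exp(-k2 * t)


def fit_single_exponential(times, p_unfold):
    """Least-squares fit of ln P = -k t (line through the origin); returns k."""
    num = sum(t * math.log(p) for t, p in zip(times, p_unfold))
    den = sum(t * t for t in times)
    return -num / den


def rms_log_residual(times, p_unfold, k):
    """Root-mean-square deviation of ln P from the fitted line -k t."""
    sq = [(math.log(p) + k * t) ** 2 for t, p in zip(times, p_unfold)]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    times = [1.0, 2.0, 3.0, 4.0]
    # Synthetic data with made-up rate constants.
    single = [math.exp(-0.5 * t) for t in times]
    double = [double_exponential(t, 0.5, 2.0, 0.1) for t in times]
    for name, data in (("single", single), ("double", double)):
        k = fit_single_exponential(times, data)
        print(name, round(k, 3), round(rms_log_residual(times, data, k), 3))
```

A residual near zero supports the single-exponential description; a clearly nonzero residual indicates a second phase, which would then need a nonlinear fit of the double-exponential form.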
Fig. 5.
Minimalist model of the native state topology for Pin WW domain (Right) and the NMR solution structure (Left), showing the very similar arrangement of secondary and tertiary structure.


