Proc Natl Acad Sci U S A. 2012 Oct 2;109(40):16155-60.
doi: 10.1073/pnas.1207719109. Epub 2012 Sep 14.

Polymer scaling laws of unfolded and intrinsically disordered proteins quantified with single-molecule spectroscopy

Hagen Hofmann et al. Proc Natl Acad Sci U S A. 2012.

Abstract

The dimensions of unfolded and intrinsically disordered proteins are highly dependent on their amino acid composition and solution conditions, especially salt and denaturant concentration. However, the quantitative implications of this behavior have remained unclear, largely because the effective theta-state, the central reference point for the underlying polymer collapse transition, has eluded experimental determination. Here, we used single-molecule fluorescence spectroscopy and two-focus correlation spectroscopy to determine the theta points for six different proteins. While the scaling exponents of all proteins converge to 0.62 ± 0.03 at high denaturant concentrations, as expected for a polymer in good solvent, the scaling regime in water strongly depends on sequence composition. The resulting average scaling exponent of 0.46 ± 0.05 for the four foldable protein sequences in our study suggests that the aqueous cellular milieu is close to effective theta conditions for unfolded proteins. In contrast, two intrinsically disordered proteins do not reach the Θ-point under any of our solvent conditions, which may reflect the optimization of their expanded state for the interactions with cellular partners. Sequence analyses based on our results imply that foldable sequences with more compact unfolded states are a more recent result of protein evolution.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Structures and amino acid compositions of the proteins used in this study (A) and single-molecule FRET efficiency histograms for CspTm (Csp66, SI Appendix, Table S1) at different concentrations of GdmCl (B). (A) Mean net charge, including the charges of the attached fluorophores, versus mean hydrophobicity per residue for hCyp, CspTm, R15, R17, IN, and ProTα (variants ProT53 and ProT54, SI Appendix) (circles). Error bars are standard deviations of mean net charge and mean hydrophobicity of the different variants of each protein. The density plot represents the distribution of 10,905 monomeric proteins with a sequence similarity ≤ 30% taken from the Protein Data Bank. The horizontal dashed line indicates a mean net charge of zero. Diagonal dashed lines indicate the separation line between intrinsically disordered and folded proteins suggested by Uversky et al. (48).
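The mean net charge per residue plotted in panel A can be illustrated with a minimal sketch. It assumes a simple charge model at neutral pH (Lys and Arg count as +1, Asp and Glu as -1, His is treated as neutral) and lumps the fluorophore charges into a hypothetical `extra_charge` argument. Neither the function name nor the normalization by sequence length is taken from the paper.

```python
# Simplified charge model at neutral pH (an assumption, not the paper's method).
POSITIVE = {"K", "R"}
NEGATIVE = {"D", "E"}


def mean_net_charge(sequence, extra_charge=0.0):
    """Return the net charge per residue of a one-letter amino acid sequence.

    extra_charge is a hypothetical parameter for charges that are not part
    of the sequence itself, such as attached fluorophores.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    positive = sum(1 for aa in seq if aa in POSITIVE)
    negative = sum(1 for aa in seq if aa in NEGATIVE)
    return (positive - negative + extra_charge) / len(seq)


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    print(mean_net_charge("MKKDEAR"))
```

A value near zero falls on the horizontal dashed line in panel A; strongly charged sequences such as ProTα lie far from it.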
Fig. 2.
Radius of gyration, RG, for all proteins and variants as a function of the number of bonds, Nbonds = N + l, at different GdmCl concentrations (see color scale). Each dye linker was estimated to be equivalent to 4.5 peptide bonds (l = 9) (61). Colored dashed lines are fits according to Eq. 3 with formula image. The contour plots represent the distribution of RG values for the folded proteins shown in A. Gray circles are the RG values determined for unfolded proteins via SAXS, taken from Kohn et al. (40). Open blue circles are RG values of denatured proteins under native conditions determined with SAXS, taken from Uzawa et al. (30). Black solid lines are fits of the data taken from Kohn et al. (40) and of the 10,905 monomeric native proteins from the Protein Data Bank with Eq. 3. The resulting scaling exponents are indicated.
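As a rough illustration of how a scaling exponent can be read off data like these, the sketch below fits a generic power law, R_G = rho0 * N^nu, by linear regression in log-log space. Eq. 3 of the paper is not reproduced in this record, so the functional form, the function name `fit_scaling_exponent`, and the synthetic data are all assumptions made for the example.

```python
import math


def fit_scaling_exponent(n_bonds, r_g):
    """Least-squares fit of log(R_G) = log(rho0) + nu * log(N).

    Returns (nu, rho0). Assumes a plain power law, which may differ from
    the prefactor convention used in the paper's Eq. 3.
    """
    xs = [math.log(n) for n in n_bonds]
    ys = [math.log(r) for r in r_g]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx
    rho0 = math.exp(mean_y - nu * mean_x)
    return nu, rho0


# Synthetic, noise-free data generated with nu = 0.6 and rho0 = 0.2 nm.
chain_lengths = [50, 75, 100, 150, 200]
radii = [0.2 * n ** 0.6 for n in chain_lengths]
nu, rho0 = fit_scaling_exponent(chain_lengths, radii)

if __name__ == "__main__":
    print(f"nu = {nu:.3f}, rho0 = {rho0:.3f} nm")
```

With real measurements the fit would also need uncertainties on each point; the sketch omits them for brevity.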
Fig. 3.
Scaling exponents (A) and phase transition surface (B) for the unfolded proteins and variants of this study. (A) Error bars represent the uncertainties of the fits shown in Fig. 2, and the distributions in water (Left) and 6 M GdmCl (Right) reflect the changes in the scaling exponents upon variation of formula image by ± 10% around its estimated value of 0.40 nm. (B) Comparison between experimentally determined expansion factors α (filled circles) for all variants and proteins of this study and the numerically computed expansion factors α with our estimate for RGΘ using Eq. 1. Shaded volumes indicate the regimes of attractive (ε > 0) and repulsive (ε < 0) intrachain interaction energies. The gray shaded region indicates the transition regime between αc = 1, the critical value for infinitely long chains, and αc = 1 + (19/22)ϕ0, the approximation for finite chains as given by Sanchez (21). Here, ϕ0 is the volume fraction of the Θ-state relative to the most compact state (SI Appendix).
Fig. 4.
Comparison between the radii of gyration and the hydrodynamic radii for hCyp as a function of GdmCl activity. (A) Radius of gyration, RG, (blue circles) for Cyp163 (SI Appendix, Table S1) rescaled to the full length sequence (Nbonds = 166 + 9) according to the scaling laws shown in Fig. 2, and hydrodynamic radius (RH) determined from 2fFCS (red circles) for the donor-labeled variant CypV2C as a function of the denaturant activity, aGdmCl. Error bars for RG were estimated from the change in formula image by ± 10%. Error bars for RH represent the standard deviation of ± 0.1 nm estimated from the calibration of the instrument (SI Appendix). Solid lines are fits according to y = y(0) + γaGdmCl/(K + aGdmCl), where y is RG or RH, respectively. Inset: Arrangement of the foci with parallel and vertical polarization in the 2f-FCS setup (51). (B) RG/RH as a function of the GdmCl activity. Error bars result from the error propagation of the uncertainties shown in A. The solid line is the ratio of the fits shown in A.
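The fit function quoted in the caption, y = y(0) + γa/(K + a), and the ratio shown in panel B can be sketched as follows. The parameter values are invented for illustration. The error propagation assumes uncorrelated uncertainties combined to first order, which is a common choice but is not confirmed by this record.

```python
import math


def saturation_curve(activity, y0, gamma, k):
    """Evaluate y = y0 + gamma * a / (k + a) for a denaturant activity a."""
    return y0 + gamma * activity / (k + activity)


def ratio_with_error(r_g, r_g_err, r_h, r_h_err):
    """Return R_G / R_H and its uncertainty.

    Assumes independent errors; relative errors are added in quadrature.
    """
    ratio = r_g / r_h
    relative = math.sqrt((r_g_err / r_g) ** 2 + (r_h_err / r_h) ** 2)
    return ratio, ratio * relative


if __name__ == "__main__":
    # Hypothetical fit parameters (nm, nm, activity units), not from the paper.
    for a in (0.0, 1.0, 5.0):
        r_g = saturation_curve(a, y0=2.0, gamma=2.0, k=1.0)
        r_h = saturation_curve(a, y0=2.2, gamma=1.0, k=1.0)
        print(a, ratio_with_error(r_g, 0.2, r_h, 0.1))
```

At a = 0 the curve returns y(0), and at a = K it has risen by half of γ, which makes K the midpoint activity of the expansion.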
Fig. 5.
Relative intrachain interaction energies, Δεtotal, as a function of GdmCl activity, and comparison between Δεtotal and Δgsol. (A) Δεtotal for the proteins of this study (circles, colors as in Fig. 3B) together with the fits according to the Schellman weak binding model (gray solid line), and, for comparison, the Tanford transfer free energies Δgsol calculated for the full-length sequences (black line) according to ref. . Contributions from the backbone and side chains to Δgsol are shaded in blue and green, respectively. The effect of the δgsol-values estimated for Glu and Asp on Δgsol is indicated as a light green shaded area. From the discrepancy between Δεtotal and Δgsol for ProTα, we obtained δgsol for Glu and Asp at 6 M GdmCl to be −798 cal mol⁻¹ (SI Appendix, Eq. S14 and Table S2). (B) Correlation between Δεtotal and Δgsol and thermodynamic cycle (C) illustrating the effect of GdmCl on the chain energy as explained in the main text. State 1 is a hypothetical expanded unfolded state in water and state 3 is the same state in the presence of GdmCl. State 2 is the collapsed unfolded state in water.
Fig. 6.
Scaling exponents, sequence composition, and evolutionary trends. (A) Correlation between the scaling exponents of the proteins and the net charges of their sequences at pH 7. (B) Correlation between the scaling exponents of the six proteins and the mean hydrophobicity of their sequences. Horizontal error bars are the standard deviations as shown in Fig. 1A; vertical error bars reflect the changes in the scaling exponents upon variation of formula image by ± 10%. Dashed lines in A and B are global fits according to empirical equations chosen to give reasonable limits of ν (SI Appendix, Eq. S29). Insets: Frequency histograms of the predicted scaling exponents for the unfolded states of the proteins selected from the PDB shown in Fig. 1 A and B based on the fits in A (red) and B (blue), respectively. The shaded areas indicate the regime of scaling exponents between ν = 0.40 and ν = 0.51, which encompass 93% of proteins in A and 71% of proteins in B. (C–E) Distributions of predicted scaling exponents (Top) and mean net charge versus hydrophobicity (Bottom) for 50,000 amino acid sequences drawn randomly from the amino acid frequency distribution of the last universal ancestor (C), current proteins (D), and predicted for the distant future (E). The mean scaling exponents are indicated. See SI Appendix, Eqs. S29–S31 for calculation of the scaling exponents. Amino acid frequencies were taken from table 3 in ref. .

References

    1. Hagen SJ, Hofrichter J, Szabo A, Eaton WA. Diffusion-limited contact formation in unfolded cytochrome c: Estimating the maximum rate of protein folding. Proc Natl Acad Sci USA. 1996;93:11615–11617.
    2. Bieri O, et al. The speed limit for protein folding measured by triplet–triplet energy transfer. Proc Natl Acad Sci USA. 1999;96:9597–9601.
    3. Schuler B, Lipman E, Eaton W. Probing the free-energy surface for protein folding with single-molecule fluorescence spectroscopy. Nature. 2002;419:743–747.
    4. Müller-Späth S, et al. Charge interactions can dominate the dimensions of intrinsically disordered proteins. Proc Natl Acad Sci USA. 2010;107:14609–14614.
    5. Shoemaker B, Portman J, Wolynes P. Speeding molecular recognition by using the folding funnel: The fly-casting mechanism. Proc Natl Acad Sci USA. 2000;97:8868–8873.
