Nature. 2024 Sep;633(8028):232-239.
doi: 10.1038/s41586-024-07784-4. Epub 2024 Aug 7.

The ribosome lowers the entropic penalty of protein folding

Julian O Streit et al. Nature. 2024 Sep.

Abstract

Most proteins fold during biosynthesis on the ribosome1, and co-translational folding energetics, pathways and outcomes of many proteins have been found to differ considerably from those in refolding studies2-10. The origin of this folding modulation by the ribosome has remained unknown. Here we have determined atomistic structures of the unfolded state of a model protein on and off the ribosome, which reveal that the ribosome structurally expands the unfolded nascent chain and increases its solvation, resulting in its entropic destabilization relative to the peptide chain in isolation. Quantitative 19F NMR experiments confirm that this destabilization reduces the entropic penalty of folding by up to 30 kcal mol−1 and promotes formation of partially folded intermediates on the ribosome, an observation that extends to other protein domains and is obligate for some proteins to acquire their active conformation. The thermodynamic effects also contribute to the ribosome protecting the nascent chain from mutation-induced unfolding, which suggests a crucial role of the ribosome in supporting protein evolution. By correlating nascent chain structure and dynamics to their folding energetics and post-translational outcomes, our findings establish the physical basis of the distinct thermodynamics of co-translational protein folding.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. The ribosome modulates the conformational ensemble of the unfolded state.
a, Exemplar regions of a 1H−15N HMQC NMR spectrum of isolated FLN5 A3A3 spin-labelled at C740 (right) and FLN5+31 A3A3 C740 (left) with the paramagnetic and diamagnetic spectra overlaid. NMR data were recorded at 800 MHz, 283 K. b, PRE-NMR intensity ratio profiles (intensity in the paramagnetic, Ipara, and diamagnetic, Idia, spectrum; fitted mean ± root mean square error (RMSE) propagated from spectral noise) for isolated FLN5 A3A3 spin-labelled at C740 (black) and FLN5+31 A3A3 C740 (RNC; blue). Theoretical reference profiles expected for a fully extended polypeptide are shown as dashed lines (Methods). The secondary structure elements (β-strands) of native FLN5 are indicated at the top. The shaded region at the C terminus represents the region of FLN5 that is broadened beyond detection owing to ribosome interactions (N730–K746, in the RNC). c, Representative ensembles of unfolded FLN5 on and off the ribosome. The structures shown represent the top 50 cluster centroids (to scale; clustering details in Methods) after reweighting with the PRE-NMR data. d, Top 10 individual structural clusters on and off the ribosome labelled with their respective population (not to scale). e, Distributions of the radius of gyration (Rg) for both ensembles for the residues belonging to the FLN5 domain (M637–G750). The shaded area represents the s.e.m. estimated from block averaging. f,g, Average inter-residue contact maps of isolated (f) and ribosome-bound (g) FLN5 (zoomed to a probability of 0.1 for ease of visualization). The black contours represent contacts formed in the native state of FLN5. h, Average inter-residue long-range (defined here as a separation of at least ten residues) contact probability along the protein sequence (shaded regions represent s.e.m. from block averaging). Source data
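Two quantities recur throughout these captions: the radius of gyration (Rg) and the s.e.m. estimated from block averaging. The following is a minimal sketch of both, not the authors' analysis code. It assumes an unweighted Rg (all atoms given equal mass) and non-overlapping, equal-sized blocks; the function names `radius_of_gyration` and `block_sem` are hypothetical.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.

    Assumption: every point carries equal mass (no mass weighting).
    """
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2 for p in coords
    ) / n
    return math.sqrt(mean_sq)

def block_sem(values, n_blocks):
    """Standard error of the mean from non-overlapping, equal-sized blocks.

    Trailing values that do not fill a whole block are ignored.
    """
    size = len(values) // n_blocks
    if n_blocks < 2 or size == 0:
        raise ValueError("need at least two non-empty blocks")
    means = [
        sum(values[i * size:(i + 1) * size]) / size for i in range(n_blocks)
    ]
    grand = sum(means) / n_blocks
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)
```

Block averaging is used instead of a naive s.e.m. because consecutive simulation frames are correlated; treating block means as the independent samples gives a more realistic uncertainty.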
Fig. 2. The unfolded state is entropically destabilized on the ribosome.
a, The local radius of gyration along the sequence (21-residue moving average). The shaded regions correspond to residues K646–S650 (left), G660–K680 (middle) and G700–N740 (right). b, Difference (RNC-iso) in the total conformational entropy using 50 bins (Methods). The s.e.m. was estimated from block averaging (using a 7.5 μs sampling block size). Bars are coloured according to the gradient in d. c, Difference in the maximum theoretical SASA of the unfolded state (RNC-iso; Methods). The s.e.m. was estimated from block averaging. The bars are coloured according to the gradient in e. d, Entropy difference between the RNC and isolated ensembles mapped onto representative ensembles of FLN5+31 A3A3. e, Changes in SASA between the RNC and isolated ensembles mapped onto the representative ensembles of FLN5+31 A3A3. f, Bar chart (mean ± s.e.m.) summarizing the energetic changes between the unfolded state on and off the ribosome (RNC-iso) at 298 K. All quantities are estimated based on the molecular dynamics ensemble averages, except the ‘ribosome binding’ contribution, which was experimentally determined. Errors were combined from block averaging and empirical parameter uncertainties (Methods). g, 19F NMR spectra of the folding equilibrium of FLN5 labelled with 4-trifluoromethyl-l-phenylalanine (tfmF) at position 655 on and off the ribosome at 288 K and 298 K. Native (N), unfolded (U) and intermediate states on (I1, I2) and off (Iiso) the ribosome are indicated. h, Temperature dependence of the folding equilibrium constant (Keq) of FLN5 on and off the ribosome measured by 19F NMR (mean ± s.e.m.). Data were fit to a modified Gibbs–Helmholtz equation (Methods). i, Thermodynamic parameters (mean ± s.d. from fits, T = 298 K) calculated from the nonlinear fit in h. Source data
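The exact "modified" Gibbs–Helmholtz equation used for the fits is given in the paper's Methods and is not reproduced on this page. As an illustration only, the sketch below uses the standard textbook form with a temperature-independent heat capacity change, referenced to 298 K, and assumes Keq = [N]/[U] so that ΔG = −RT ln Keq. The function names and parameter names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def delta_g(temp_k, dh_ref, ds_ref, dcp, t_ref=298.0):
    """Folding free energy (kcal mol^-1) at temp_k.

    Standard Gibbs-Helmholtz form with constant dCp (an assumption here):
      dH(T) = dH_ref + dCp * (T - T_ref)
      dS(T) = dS_ref + dCp * ln(T / T_ref)
      dG(T) = dH(T) - T * dS(T)
    Units: dh_ref in kcal mol^-1; ds_ref and dcp in kcal mol^-1 K^-1.
    """
    dh = dh_ref + dcp * (temp_k - t_ref)
    ds = ds_ref + dcp * math.log(temp_k / t_ref)
    return dh - temp_k * ds

def k_eq(temp_k, dh_ref, ds_ref, dcp, t_ref=298.0):
    """Folding equilibrium constant, assuming Keq = exp(-dG / RT)."""
    return math.exp(-delta_g(temp_k, dh_ref, ds_ref, dcp, t_ref) / (R_KCAL * temp_k))
```

In a fit of this kind, ΔH, ΔS and ΔCp at the reference temperature are the free parameters, and the entropic contribution at 298 K is reported as −TΔS.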
Fig. 3. Entropic destabilization persists at long linker lengths and leads to stable coTF intermediates.
a, PRE-NMR intensity ratio profiles are shown for the C740 labelling site (black star) window averaged over three residues for isolated FLN5 A3A3, FLN5+31 A3A3 and FLN5+67 A3A3 as in Fig. 1b. b, Left, temperature dependence of the folding equilibrium constants of isolated FLN5Δ6, FLN5+34 (wild-type) and FLN5+67 measured by 19F NMR (mean ± s.e.m.) fit to a modified Gibbs–Helmholtz equation (Methods). The error bars of individual datapoints are similar in magnitude to the size of the circles. Right, Thermodynamic parameters (mean ± s.d., T = 298 K) calculated from the nonlinear fits and shown as the difference relative to the isolated protein. c, Folding free energies (mean ± s.e.m. propagated from NMR lineshape fits) of the coTF intermediates I1 and I2 at different linker lengths (x axis) (ref. and Extended Data Fig. 8o). The folding free energy of the isolated FLN5Δ6 intermediate is shown as a horizontal line. The contributions to the stabilization of the intermediates on the ribosome due to ribosome binding (ΔΔGbinding = RTln(1 − pB), where pB is fraction bound) and entropy (ΔΔGentropy = ΔΔGI-U,RNC-iso − ΔΔGbinding) are shown as vertical bars. The ribosome-bound population was estimated using a τR,bound (rotational correlation time of the bound state) of 3,003 ns (Sbound2 = 1.0; order parameter of the bound state). d, Model of the free energy landscape of folding on and off the ribosome. The unfolded (U) state is destabilized on the ribosome relative to in isolation, outcompeted by stabilizing ribosome interactions. I2 is stabilized by less than 0.1 kcal mol−1 on the ribosome owing to interactions (see c) from its folding free energy of at least ∆GI(iso)-U(iso) (that is, the stability of Iiso, owing to the structural similarity between Iiso and I2; lower bound estimate). 19F NMR has shown that the native state is destabilized relative to U. Source data
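The decomposition in panel c can be written out directly from the two expressions quoted in the caption, ΔΔGbinding = RT ln(1 − pB) and ΔΔGentropy = ΔΔGI-U,RNC-iso − ΔΔGbinding. The sketch below is illustrative: the function names are hypothetical, and the bound fraction used in the usage note is an arbitrary example value, not a number from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_binding(p_bound, temp_k=298.0):
    """Binding contribution, ddG_binding = R*T*ln(1 - pB), in kcal mol^-1.

    p_bound is the fraction of the state that is ribosome-bound (0 <= pB < 1).
    """
    if not 0.0 <= p_bound < 1.0:
        raise ValueError("p_bound must be in [0, 1)")
    return R_KCAL * temp_k * math.log(1.0 - p_bound)

def ddg_entropy(ddg_total, ddg_bind):
    """Remainder attributed to entropy: ddG_(I-U,RNC-iso) minus ddG_binding."""
    return ddg_total - ddg_bind
```

For example, a hypothetical bound fraction of 0.5 at 298 K gives `ddg_binding(0.5)` of roughly −0.41 kcal mol−1; whatever part of the measured ΔΔG is not accounted for by this term is assigned to the entropic contribution.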
Fig. 4. Co-translational folding intermediates of titin I27 and HRAS are stabilized on the ribosome.
a,b, Crystal structures of titin I27 (Protein Data Bank (PDB): 1TIT) (a) and HRAS (PDB: 4Q21) (b). c, 19F NMR spectra of titin I27, tfmF-labelled at position 14, off the ribosome (isolated), isolated I27 in urea, and I27 tethered with a 34-residue linker to the ribosome (I27+34 RNC) recorded at 298 K. d, 19F NMR spectra of the HRAS G-domain (residues 1–166), tfmF-labelled at position 32, off the ribosome (isolated), in urea and HRAS on the ribosome arrested at 3 different lengths and recorded at 298 K. HRAS1–81, HRAS1–189 and HRAS1–189+20GS correspond to residues 1–81 of HRAS, full-length HRAS and full-length HRAS with an additional 20-residue poly(glycine-serine) linker, respectively, each tethered to the ribosome by the arrest-enhanced SecM motif. e,g, Nonlinear fits to a modified Gibbs–Helmholtz equation (Methods) of the equilibrium constants (mean ± s.e.m. propagated from NMR lineshape fits) of titin I27 folding off and on the ribosome (e) and the HRAS1–81 RNC (g) measured by 19F NMR. f, Thermodynamic parameters determined from the nonlinear fit (mean ± s.d., T = 298 K) of the temperature dependence of I27 folding. The heat capacities of folding (ΔCp,N-U) obtained for the isolated and ribosome-bound protein are −2.0 ± 0.1 and +0.6 ± 1.7 kcal mol−1 K−1, respectively. The heat capacity of the isolated protein is similar to the literature value reported for the wild-type variant (−1.4 kcal mol−1 K−1). h, Thermodynamic parameters determined from the nonlinear fit (mean ± s.d., T = 298 K) of the temperature dependence of folding for HRAS1–81. The heat capacity of folding obtained from the fit is −0.4 ± 0.3 kcal mol−1 K−1. Source data
Fig. 5. Destabilizing mutations are buffered by the ribosome.
a, Folding free energies of destabilizing mutants off and on the ribosome (FLN5+34 and FLN5+67 depending on the stability of the mutant) determined by 19F NMR from U and N state populations. b, Destabilization (ΔΔGN-U,mut-WT) of all mutants in isolation compared to the RNC. WT, wild type. c, Destabilization of 4 mutants in isolation, on the ribosome (FLN5+67) and on the ribosome in the presence of 2.5 M urea. d, Left, temperature dependence of the folding equilibrium constant involving the N and U state of isolated FLN5Δ6, FLN5+34 and FLN5+34 in 1.5 M urea measured by 19F NMR fit to a modified Gibbs–Helmholtz equation (Methods). The error bars of individual datapoints (propagated from bootstrapped errors of NMR lineshape analyses) are similar in magnitude to the size of the circles. Right, thermodynamic parameters (T = 298 K) from the nonlinear fits (mean ± s.d.) shown as the difference relative to the isolated protein. Unless stated otherwise, all values in the figure represent the mean ± s.e.m. propagated from NMR lineshape fits. Source data
Fig. 6. The ribosome re-balances the enthalpy–entropy compensation of protein folding.
The unfolded state is entropically destabilized owing to a lower conformational entropy and increased solvation, whereas the native state is enthalpically destabilized. This results in a reduction of the entropic penalty for folding and a less favourable folding enthalpy. The labels in grey and black indicate weak and strong energetic contributions, respectively.
Extended Data Fig. 1. PRE analysis of unfolded FLN5 on and off the ribosome.
(A) Schematics of the constructs used for the PRE experiments. The RNC is composed of an N-terminal His-tag (for purification), FLN5 A3A3, the subsequent domain FLN6, and an enhanced version of the SecM-AE1 stalling sequence. The FLN5 A3A3 mutant was previously described. (B) (Left) The annotated crystal structure (PDB 1QFH) is shown from two views towards the two main β-sheets, highlighting the PRE labelling sites used for both the isolated protein and the RNC. (Right) The secondary structure of folded FLN5 and labelling sites are shown. (C) Region of an exemplar 1H-15N HMQC NMR spectrum of isolated FLN5 A3A3 spin-labelled at C657 (see Supplementary Fig. 1 for full spectrum). The paramagnetic and diamagnetic spectra are overlaid. (D) PRE intensity ratio profiles for six different labelling sites (indicated with the black star) on (blue) and off (black) the ribosome. NMR data were recorded at 800 MHz, 283 K. Theoretical reference profiles expected for a fully extended polypeptide are also shown as dashed lines (see methods). The secondary structure elements (β-strands) of native FLN5 are indicated at the top. The shaded region at the C-terminus represents the region of FLN5 that is broadened beyond detection through ribosome interactions (N730-K746, in the RNC). The second panel with grey bars under each dataset shows the difference between the RNC and isolated data. (E) (Top) The annotated crystal structure (PDB 1QFH) of FLN5 is shown with two additional labelling sites used for the RNC construct. (Bottom) Annotated MTSL labelling sites (yellow circles) on the ribosome structure near the exit tunnel. (F) PRE intensity ratio profiles for the two additional labelling sites within FLN5 A3A3 and two ribosomal MTSL labelling sites recorded at 800 MHz, 283 K. All data show the fitted mean NMR intensities ± RMSE propagated from spectral noise. See Supplementary Fig. 1 for NMR spectra. Source data
Extended Data Fig. 2. MTSL labelling, quality control and optimisation of PRE-NMR experiments.
(A-B) Mass spectrometry analysis of MTSL-labelled FLN5 A3A3 cysteine variants C699 V747 (A) and C744 V747 (B). Black arrows indicate the mass of unlabelled FLN5 A3A3 and red arrows the mass of MTSL-labelled protein. (C) Fluorescent gel (12% BisTris) of purified 70 S and RNC (FLN5 + 31 A3A3 C699 V747) samples labelled with a fluorescent MTSL analogue (ABD-MTS) at pH 8.0 for the indicated time. The gel shows a distinct band for the NC in addition to the ribosome background. Ribosomal proteins are also annotated based on molecular weight estimates. The experiment was performed three times (n = 3) and a representative gel image is shown (see supplementary information, Supplementary Fig. 2 for uncropped gel images). (D) Representative anti-hexahistidine western blot (12% BisTris gel) of FLN5 + 31 A3A3 V747 with a cysteine at C699 and C744 during reaction time-course with molar excess (10000x) of PEG maleimide at pH 7.5 to probe the accessibility and reactivity of the cysteine variants. The fraction PEGylated (mean ± SD; n = 2 for C699; n = 3 for C744) was estimated by densitometry and plotted as a function of time (see supplementary information, Supplementary Fig. 3 for uncropped gel images). (E) A representative Coomassie and fluorescent gel (20% Tricine) of purified WT, L23 G90C and L24 N53C 70 S ribosomes after overnight incubation with 10x molar excess fluorescein maleimide at pH 7.5. (See supplementary information, Supplementary Fig. 4 for uncropped gel images; n = 2 for L23 G90C; n = 3 for WT and L24 N53C). (F) PRE intensity ratios of the FLN5 + 31 A3A3 variant without any cysteines in the NC (C747V, Δcys). (G) Chemical shift perturbations (CSPs) along the protein sequence for all MTSL-labelled isolated protein (upper row) and RNC (lower row) variants measured in the 1H-15N SOFAST-HMQC spectra of FLN5 + 31 A3A3 RNC cysteine variants relative to the isolated FLN5 A3A3 protein and the FLN5 + 31 A3A3 RNC, respectively. 
The labelling sites are indicated with a star (*). The dotted line indicates a threshold of 0.06 ppm. (H) Integrity of RNCs during PRE experiments was monitored with 15N-SORDID diffusion measurements. The calculated diffusion coefficient D is shown throughout NMR acquisition (centre), highlighting the paramagnetic (grey) and the diamagnetic acquisition timeframe (red). (I) Optimisation of the recycle delay (d1) time chosen for PRE SOFAST-HMQC experiments to provide maximum sensitivity while also allowing the signal to relax completely before the subsequent scan is initiated. 1D 1H spectra at d1 values ranging from 50-800 ms (top, yellow to red gradient); total signal intensity dependence on the d1 value (middle); time-averaged signal (bottom). 450 ms was chosen for PRE experiments. (J) Diffusion coefficients of the DSS reference and isolated FLN5 A3A3 in different concentrations of glycerol. The extracted radius of hydration (Rh) for the protein is also shown. The values at 5% and 18% of glycerol were calculated taking into account the increase in viscosity from the DSS diffusion measurements. (K) PRE analysis of isolated FLN5 A3A3 C740 V747 in different concentrations of glycerol. The upper panel shows all individual datapoints while the lower panel shows the data averaged over a window of three residues for ease of visualisation. (L) Theoretical effect of increasing viscosity on the PRE intensity ratios (Ipara/Idia). The upper panel shows the predicted PRE profile of the FLN5 A3A3 ensemble obtained after reweighting using different values of τC (shown in legend in nanoseconds) and the lower panel shows an overlay of the experimental data at 0 and 18% glycerol with the MD profiles using τC of 3 and 12 ns. (M) Theoretical effect of increasing residue-specific τC values towards the C-terminus for a tethered polymer, using Eq. 
S16 and $S^2_{NC} = (1/d) \times S^2_{NC,max}$, where d is the distance to the C-terminal residue (in amino acids) and $S^2_{NC,max}$ is the maximum order parameter that the C-terminal residue can reach (set to 0.1 for this illustrative example). The top plot shows the experimental RNC PRE-NMR data and isolated PREs (computed from the reweighted MD ensemble) with either a uniform τC of 3 ns across the sequence or the tethering τC values from the panel below. Unless otherwise indicated, all NMR data are presented as the fitted mean ± RMSE propagated from the spectral noise. Source data
Extended Data Fig. 3. Analysis and reweighting of MD simulations for isolated FLN5 A3A3.
(A) Probability distributions of the all-atom radius of gyration (Rg) for the different ensembles (mean ± SEM from block averaging). (B) Probability distributions of the fraction of native contacts (Q, relative to natively folded FLN5, mean ± SEM from block averaging). (C) Ensemble-averaged properties including Rg, Q and secondary structure populations are summarised (mean ± SEM from block averaging). (D) Average secondary structure propensities (mean ± SEM from block averaging) along the protein sequence determined using the DSSP algorithm (C = coil, E = strand, H = helix). The vertical shaded areas highlight the regions of β-strands (annotated as strands A-G) in natively folded FLN5. (E) Average contact maps of the ensembles (zoomed in to a probability of 0.2 for clarity). Contacts were defined as Cα–Cα distances of less than 10 Å. The black contours highlight the native contact map of folded FLN5. Above and below the diagonal are identical. (F) Overlay of experimental data (shown in transparent orange bars) with the calculated PREs of the four ensembles before (F) and after (H) reweighting. Colours are as in panels A-B. (G) Determination of optimal τC for each ensemble by computing the reduced χ2 statistic against the experimental PRE-NMR data (Extended Data Fig. 1). Values of τC were scanned in steps of 1 ns from 1 to 15 ns and the optimal value found is displayed in the figure legend. Colours are as in panels A-B. (I) L-curve analysis to identify an optimal balance between the prior ensemble and agreement with experimental data. The entropy term on the x-axis represents the Kullback-Leibler divergence and quantifies the extent of deviation from the prior ensemble. 
The optimal value of τC as determined from the prior ensemble as well as the χ2, RMSD and Neff (fraction of effective frames contributing to the ensemble average, calculated as exp(−Entropy)) are displayed in each panel for the corresponding elbow of the L-curve, which is the final solution chosen from the reweighting analysis (see methods). Source data
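The relationship between the entropy term of the L-curve and Neff can be sketched as follows. This assumes the convention commonly used in maximum-entropy and Bayesian ensemble reweighting, in which the Kullback-Leibler divergence between the reweighted and prior frame weights is D = Σ w_i ln(w_i / w0_i) and the effective fraction of frames is Neff = exp(−D). The function names are hypothetical and both weight vectors are assumed to be normalised.

```python
import math

def kl_divergence(weights, prior):
    """Kullback-Leibler divergence of reweighted frame weights from the prior.

    Both inputs are assumed to sum to 1. Frames with zero posterior weight
    contribute nothing (the limit of w * ln(w) as w -> 0 is 0).
    """
    return sum(
        w * math.log(w / p) for w, p in zip(weights, prior) if w > 0.0
    )

def effective_fraction(weights, prior):
    """Fraction of effective frames, assuming Neff = exp(-KL divergence)."""
    return math.exp(-kl_divergence(weights, prior))
```

With unchanged weights Neff is 1; if reweighting concentrates all weight on half of the frames, Neff drops to 0.5. The elbow of the L-curve is the point where further gains in χ2 would cost a disproportionate loss in Neff.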
Extended Data Fig. 4. Validation of the ensembles against orthogonal data not used in the reweighting process.
(A) Diffusion coefficients (mean ± RMSE propagated from NMR intensity fits) and radius of hydration (Rh) (see methods) as measured for folded FLN5, FLN5 A3A3 and the unfolded state of FLN5Δ6, a previously characterised truncation variant. (B) Comparison between the experimental Rh (32.6 ± 0.1 Å, plotted as a horizontal line in magenta) and the calculated Rh of the ensembles before (black bar) and after (yellow bar) reweighting. The error bars represent the uncertainty around the ensemble average expected from the forward model (see methods). The right panel shows the corresponding χ2 values, quantifying the agreement with the experimental data. (C) Secondary Cα chemical shifts of FLN5 A3A3 using the random coil shifts predicted by POTENCI. (D) Comparison between experimental and calculated chemical shifts from the MD ensembles before (black bars) and after (yellow bar) reweighting for each nucleus. The table above the plot summarises a global agreement score, calculated by adding the nucleus specific RMSD values normalised by the error of the forward model. The forward model error is plotted as a horizontal line in the bar plots, taken as the RMSE values reported by the method. (E) Comparison between the experimental RDCs (grey bars) measured in PEG/octanol with the simulated RDCs before (dotted line) and after reweighting with the PRE data (solid line). The RDC Q-factors are used to quantify the agreement. (F) Guinier region and linear fit (red line) to the experimental SAXS data (black circles). The bottom plot shows the residuals. (G) Experimental SAXS profile shown as a double log plot (mean ± errors propagated as determined by the ATSAS package). (H) Ensemble-averaged Rg values obtained from the MD ensembles before (prior) and after reweighting (posterior) compared with the experimental value from the Guinier analysis in panel F, obtained with the autorg tool, and the molecular form factor (MFF) analysis. 
(I) Comparison of the experimental and theoretical SAXS profiles obtained from the MD ensembles before and after reweighting. The goodness of fit is quantified with the reduced χ2 and residuals are shown below the main plot for the prior and posterior ensembles. (J) CD spectrum of isolated FLN5 A3A3 recorded at 283 K. (K) Secondary structure populations obtained from the NMR chemical shifts with δ2D compared with average populations observed in the MD ensembles before (in parentheses) and after reweighting (mean ± SEM from block averaging). Source data
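The Guinier analysis in panel F rests on the low-angle approximation ln I(q) = ln I(0) − (Rg² q²)/3, so Rg follows from the slope of a straight-line fit of ln I against q². The sketch below is a plain least-squares version of that idea and is not the autorg tool used by the authors: it does not select the valid q range or weight points by their errors, and the function name `guinier_rg` is hypothetical.

```python
import math

def guinier_rg(q, intensity):
    """Estimate (Rg, I0) from an unweighted linear fit of ln I versus q^2.

    Assumes all supplied points already lie in the Guinier region.
    Rg is returned in the reciprocal units of q (for example Å if q is in Å^-1).
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)
```

The residuals of such a fit (as plotted below the Guinier panel) are a quick check that the chosen q range is still within the linear regime.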
Extended Data Fig. 5. Analysis of unfolded state ensemble on the ribosome obtained from all-atom MD simulations.
(A) Modelling of MTSL rotamer distribution on ribosome labelling sites uL23 G90C and uL24 N53C. Ten E. coli ribosome PDB models (highest resolution models available to date: 4YBB, 6PJ6, 6XZ7, 7K00, 7LVK, 7N1P, 7O1A, 7PJS, 7Z20, 7ZP8) were aligned to the simulation ribosome frame in PyMOL (v2.3). For each ribosome model, MTSL rotamers were fitted to the labelling sites as described in methods. The transparent cloud represents the rotamer cloud from these ten ribosome models, highlighting how small fluctuations in the labelling site can lead to different rotamer distributions. R1 represents the rotamer distribution fitted to the ribosome model utilised in the all-atom MD simulations, while R2 is the rotamer distribution fitted to the ribosome model utilised in our previous work. We find the RNC ensembles to be in better agreement after reweighting with the R2 rotamer distribution compared to the R1 distribution and used the R2 distribution for the results presented here. (B) Bayesian reweighting of the FLN5 + 31 A3A3 RNC ensemble using the experimental PRE data is shown (see methods). The final χ2 and Neff obtained at the elbow of the curve are shown on the plot. (C) Comparison of back-calculated PREs from MD and the experimental data (black bars, Extended Data Fig. 1) before (dotted blue line) and after reweighting (solid blue line). (D) Secondary Cα chemical shifts of FLN5 + 31 A3A3 measured at 283 K using the POTENCI random coil values. (E) Average agreement (reported as the RMSD in ppm) between MD (calculated) and experimental chemical shifts before (black) and after (yellow) reweighting with the PRE data. The dotted horizontal line represents the error of the forward model. (F) β-strand secondary structure propensity (mean ± SEM from block averaging). (G) NC interactions with the ribosome mapped onto the surface of the ribosome. (H) Left: Interactions between the NC and ribosome surface along the protein sequence (mean ± SEM from block averaging). 
The black cross indicates the experimentally estimated interaction for the C-terminal binding site (within the dotted rectangle) from our previous work. Right: A comparison of amide S2 order parameters from MD simulations with relative NMR intensities further supports the accuracy of NC-ribosome interactions observed in the MD simulations. The decrease in NMR intensities towards the C-terminus around residue 720 coincides with an increase in the amide S2 (restricted dynamics due to ribosome binding). A steric-only model (see methods) does not predict this increase correctly, only showing an increase in the amide S2 at around residue 740. (I-J) The residue-specific interaction contributions from Lennard-Jones (LJ) and Coulombic energies (mean ± SEM from block averaging) of the N-terminal (I) and C-terminal (J) ribosome-binding segments are shown. Ribosome interactions are driven by positively charged C-terminal residues (R734, K739, K746) with the rRNA and E749 interacting with RNA-bound Mg2+ ions and K47 within the uL24 loop. (K) Analysis of intramolecular contacts within FLN5 A3A3 on and off the ribosome between different types of residues (oppositely charged and hydrophobic). (L-M) Probability distributions of the FLN5 A3A3 steric-only model on and off the ribosome and comparison between the steric-only model and C36m+W ensemble of the NC-ribosome interaction probability along the FLN5 sequence (mean ± SEM from block averaging). (N) Rg and (O) SASA probability distributions for isolated and RNC FLN5 A3A3 before reweighting (prior) and after reweighting with different datasets (see Supplementary Tables 2–4). Source data
Extended Data Fig. 6. Entropy analysis of the unfolded state on and off the ribosome.
(A) Convergence of the number of clusters visited (see methods for clustering details) for several different cut-off values was assessed by plotting number of clusters as a function of simulation time. This confirmed that for the higher cut-off values (1.4–1.8 nm), sampling has been sufficient to reach a plateau in the number of clusters visited. This was analysed to ensure that differences between the RNC and isolated protein are not due to differences in sampling. (B) The average Gibbs entropy ($-\sum_{i}^{n} p_i \ln p_i$, where n is the number of clusters/microstates and $p_i$ the population of each microstate) was then estimated from the full ensembles after reweighting with the PRE data. (C) and (D) show the same analysis as in panels A-B but for a simple all-atom steric model of the unfolded state (see Methods). (E) Exemplar Ramachandran free energy landscapes of A721 on and off the ribosome. (F) The average entropy (S) summed over all residues for each ensemble is shown (mean ± SEM from block averaging). The average difference per residue is shown above the plot. Structures were sampled every 20 ps with equal statistical weights (to avoid differences due to differences in reweighting between the ensembles). (G) The resulting effect on free energy (−TΔS for the entire protein at 298 K, mean ± SEM) was calculated using different block sizes of total sampling and number of bins (legend of plot). We observe a convergence towards +1.9 ± 0.2 kcal mol−1 (estimated from 7.5 μs sampling and 50 bins). (H) Asphericity (Δ, see methods) of the ensembles shown as probability distributions (mean ± SEM from block averaging). (I) Probability distribution (mean ± SEM from block averaging) of the total (i), apolar (ii) and polar (iii) solvent-accessible surface area (SASA) of FLN5 (residues 646–750) is shown for each ensemble. 
(iv) The thermodynamic parameters of the solvation free energy difference between the unfolded state on and off the ribosome were calculated based on the apolar and polar changes in surface area and experimentally-parameterised functions of the heat capacity, Cp, entropy, S, and enthalpy, H (see methods for more details). (J) Average radial distribution function of the protein (all atoms) to water (centre of mass) distance for the isolated and RNC ensemble. The vertical line represents the 3.5 Å distance cut-off chosen to define the hydration layer consisting of the first and second hydration shell. (K) Probability distributions of the number of water molecules in the first hydration layer before (dashed line) and after (solid line) reweighting with PRE-NMR data and (L) ensemble-averaged number of water molecules in the hydration layer (mean ± SEM from block averaging). (M) Molar water entropy obtained with the two-phase thermodynamic method (2PT) as a function of distance from the FLN5 A3A3 protein at 283 K for both the C36m and C36m+W parameters (which differ only in their water hydrogen LJ parameter). The horizontal line represents the bulk molar entropy of water obtained from a pure water box at 283 K (panel O). The solvation entropy (Ssolv) is the difference of the molar entropy of water in the hydration layer (0–3.5 Å) and in bulk (36–46 Å value used). Values are shown as mean ± SEM obtained from five independent simulations (n = 5, see Methods). (N) Molar water entropy as a function of distance from the FLN5 A3A3 protein with the C36m+W force field at 283 and 298 K (mean ± SEM from n = 5). Their respective bulk values obtained from pure water boxes (panel P) are shown as horizontal lines. (O) Comparison of molar entropy of water obtained from experiments, in previous work in the literature with the TIP3P water model, and values obtained in this work with C36m and C36m+W at 298 K (mean ± SEM from n = 5). 
(P) Difference in solvation entropy on and off the ribosome (RNC-isolated, mean ± SEM) obtained by using the solvation entropies per water molecule from panel N and difference in the number of water molecules in the hydration shells of the RNC and isolated ensemble (see methods). This quantity is shown for the ensembles before (prior) and after (posterior) reweighting with PRE-NMR data. Source data
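The Gibbs entropy over cluster populations (panel B) and its conversion into a free energy term (−TΔS, as in panel G) can be illustrated with a short sketch. It is not the authors' pipeline: the per-residue, binned entropy described in the caption is replaced here by a single set of microstate populations, and scaling by the gas constant R to obtain molar units is an assumption. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def gibbs_entropy(populations):
    """Dimensionless Gibbs entropy, -sum(p_i * ln p_i), over microstates.

    Populations are normalised internally; empty microstates are skipped.
    """
    total = sum(populations)
    return -sum(
        (p / total) * math.log(p / total) for p in populations if p > 0.0
    )

def minus_t_delta_s(s_rnc, s_iso, temp_k=298.0):
    """Free energy effect -T*dS (kcal mol^-1) for dS = R * (S_RNC - S_iso).

    A positive value means the RNC ensemble has the lower entropy,
    that is, it is entropically destabilised relative to the isolated chain.
    """
    return -temp_k * R_KCAL * (s_rnc - s_iso)
```

As a sanity check on the sign convention, an ensemble restricted to two equally populated states compared with one spanning four gives `minus_t_delta_s(gibbs_entropy([1, 1]), gibbs_entropy([1, 1, 1, 1]))` of about +0.41 kcal mol−1 at 298 K.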
Extended Data Fig. 7. Dependence of the folding equilibrium constant on temperature and structural perturbations observed in the native state on the ribosome.
(A-B) 19F NMR spectra of FLN5 on and off (Δ6 truncation) the ribosome recorded at a 19F-Larmor frequency of 470 MHz. Raw spectra are shown in grey, lineshape fits in colour and the total fit in black. Residuals after fitting are shown below each spectrum. (C-D) Nonlinear fits to a modified Gibbs-Helmholtz equation (see methods) of the equilibrium constants on and off the ribosome measured by 19F NMR (from panels A-B) shown as the mean ± SEM propagated from NMR line shape fits (panel C) and the resulting thermodynamic parameters (mean ± SD from fits, panel D). (E-F) 19F NMR spectra of the FLN5 mutant E6 on and off the ribosome (Δ2 truncation) recorded at a 19F-Larmor frequency of 470 MHz. The FLN5Δ2 E6 was chosen due to its suitable stability in this temperature range to quantify both [U] and [N]. Raw spectra are shown in grey, lineshape fits in colour and the total fit in black. Residuals after fitting are shown below each spectrum. (G-H) Nonlinear fits to a modified Gibbs-Helmholtz equation (see methods) of the equilibrium constants on and off the ribosome measured by 19F NMR (from panels E-F) shown as the mean ± SEM propagated from NMR line shape fits (panel G) and the resulting thermodynamic parameters (mean ± SD from fits, panel H). (I) Left: Chemical shift perturbations (CSPs) measured by NMR (1H-13C HMQC) for methyl groups of natively folded FLN5 (RNCs relative to the isolated protein). The black datapoints represent the mean ± SD from five different RNC lengths for ease of visualisation. Right: Average CSPs mapped on the crystal structure of FLN5. (J) CSPs (RNC relative to isolated protein) measured for FLN5 labelled with three different 19F-tfmF labelling sites by 19F NMR at linker lengths of 47 and 67 amino acids. 
(K) Correlation plots (along with Pearson correlation coefficients) of methyl relaxation parameters (S²axis·τC) for natively folded FLN5 in different concentrations of glycerol (left panel) and correlating FLN5 on and off the ribosome (right panel).
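The temperature dependence fitted in panels C-D and G-H can be illustrated with a minimal sketch. The exact "modified" Gibbs-Helmholtz form used for the fits is defined in the Methods and is not reproduced here; the sketch below assumes the common reference-temperature form with a temperature-independent heat-capacity change. The function names, the reference temperature of 298 K and all numerical values are hypothetical and chosen only for illustration.

```python
import math

R = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_fold(T, dH0, dS0, dCp, T0=298.0):
    """Folding free energy (kcal/mol) at temperature T (K).

    dH0 and dS0 are the folding enthalpy and entropy at the reference
    temperature T0; dCp is the heat-capacity change, assumed constant.
    """
    dH = dH0 + dCp * (T - T0)            # enthalpy at T
    dS = dS0 + dCp * math.log(T / T0)    # entropy at T
    return dH - T * dS


def k_fold(T, dH0, dS0, dCp, T0=298.0):
    """Equilibrium constant K = [N]/[U] implied by delta_g_fold."""
    return math.exp(-delta_g_fold(T, dH0, dS0, dCp, T0) / (R * T))


# Hypothetical parameters (not taken from the paper).
for T in (283.0, 298.0, 310.0):
    print(T, round(k_fold(T, dH0=-20.0, dS0=-0.06, dCp=-0.5), 3))
```

In a fit of this kind, the measured equilibrium constants at each temperature would be compared with `k_fold` while the three thermodynamic parameters are varied; the choice of optimiser is left open here.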
Extended Data Fig. 8. Expansion and entropic destabilisation of the unfolded state on the ribosome persist at longer NC linker lengths.
(A-C) PRE-NMR analysis of FLN5 A3A3 (labelled at C740, black star) in isolation and at three different RNC linker lengths (FLN5 + 31, FLN5 + 47, FLN5 + 67). Panel A shows a window average over three residues for ease of visualisation. Panels B and C show all datapoints as the fitted mean ± RMSE propagated from spectral noise. The colour scheme in panels B-C is the same as in panel A. Theoretical reference profiles expected for a fully extended polypeptide are also shown as dashed lines. The shaded region at the C-terminus represents the region of FLN5 that is broadened beyond detection through ribosome interactions (N730-K746, in the RNC). (D-E) 19F NMR spectra of FLN5 (F672A) on and off the ribosome recorded at a 19F-Larmor frequency of 470 MHz. A destabilising variant (F672A) is used to enable measurements of the unfolded state populations at FLN5 + 67. Raw spectra are shown in grey, lineshape fits in colour and the total fit in black. Residuals after fitting are shown below each spectrum. (F) Nonlinear fit to a modified Gibbs-Helmholtz equation of the equilibrium constants on and off the ribosome measured by 19F NMR (mean ± SEM propagated from NMR line shape fits). (G) Thermodynamic parameters estimated from the nonlinear fits in panel F (mean ± SD). FLN5 F672A and FLN5 Δ6 have indistinguishable thermodynamics, validating F672A as a pseudo wild-type system. (H) Nonlinear fit to a modified Gibbs-Helmholtz equation of the equilibrium constants (all constants relative to the unfolded state) on and off the ribosome measured by 19F NMR (mean ± SEM propagated from NMR line shape fits). (I) Thermodynamic parameters estimated from the nonlinear fits in panel H (mean ± SD). (J) Transverse relaxation rate (R2) measurements of isolated full-length (FL) FLN5 labelled at position 655 with tfmF recorded at a 19F-Larmor frequency of 470 MHz and 298 K.
(K) 1D 19F NMR spectra of isolated, full-length FLN5 in different concentrations of glycerol, fitted spectra in blue, raw spectra in grey. (L) Fitting of R2 rates for FL-FLN5 in different concentrations of glycerol. (M) Correlation between measured R2 rates (panel L) and those obtained from the linewidths of the peaks in the 1D spectra (panel K). Points are shown as the mean ± SEM propagated from NMR line shape fits. (N) Correlation between the 19F linewidth/R2 rate obtained from line shape fitting (mean ± SEM) and previously determined rotational correlation times of FLN5 in different concentrations of glycerol. (O) 1D 19F NMR spectrum of FLN5 + 47 used in panel P. (P) Estimated populations of coTF intermediates I1 and I2 bound to the ribosome based on the experimental 19F linewidth at 298 K and linear correlation between linewidth and rotational correlation time (panel N). The ribosome-bound populations were estimated with an S²bound = 1.0 (τR,bound = 3003 ns) and are shown as the mean ± SEM propagated from fitted NMR linewidths.
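The estimate in panel P combines a linear linewidth calibration (panel N) with a fully rigid bound state (τR,bound = 3003 ns). One simple way to sketch such an estimate is shown below. It assumes that the observed linewidth reports a population-weighted average correlation time of a free and a ribosome-bound state in fast exchange; this averaging model, the function names and the calibration values are assumptions for illustration, and the procedure actually used is described in the Methods.

```python
def tau_from_linewidth(linewidth_hz, slope, intercept):
    """Invert an assumed linear calibration: linewidth = slope * tau_c + intercept."""
    return (linewidth_hz - intercept) / slope


def bound_population(tau_obs_ns, tau_free_ns, tau_bound_ns=3003.0):
    """Bound fraction p from tau_obs = p * tau_bound + (1 - p) * tau_free.

    The result is clipped to the physical range [0, 1].
    """
    p = (tau_obs_ns - tau_free_ns) / (tau_bound_ns - tau_free_ns)
    return min(1.0, max(0.0, p))


# Hypothetical calibration and linewidth (not taken from the paper).
tau_obs = tau_from_linewidth(linewidth_hz=50.0, slope=2.0, intercept=10.0)
print(bound_population(tau_obs, tau_free_ns=10.0))
```

Clipping keeps the estimate physical when noise pushes the apparent correlation time below the free-state value or above the bound-state value.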
Extended Data Fig. 9. Co- and post-translational folding thermodynamics of I27 and HRAS.
(A) 19F NMR spectra of isolated titin I27 (F73A variant), (B) titin I27 + 34 RNC, (C) titin I27 + 34 W34E RNC (a fully unfolded variant) and (D) HRAS1-81 on the ribosome recorded at different temperatures (at a 19F-Larmor frequency of 470 MHz). (E) The linewidths of all four states in the wild-type and unfolded state of the mutant I27 + 34 RNC are shown as the mean ± SEM from fitted NMR lineshapes. (F) 19F NMR spectrum of HRAS1-81 on the ribosome with two destabilising mutations V8E/V14E recorded at 298 K and a 19F-Larmor frequency of 470 MHz. Analysis of the NMR data in the time domain (as described in ref. ) shows that the fit is better for a single state compared to two states for the mutant (BIC = 6,897 and BIC = 6,894, respectively). Wild-type HRAS1-81 fits better to two states than a single state (BIC = 17,900 and BIC = 17,721, respectively). The right panel shows the linewidths of the two states in wild-type HRAS1-81 (Fig. 4d) and the mutant shown here. The bars represent the mean ± SEM from fitted NMR lineshapes. (G) HRAS GDP/GTP nucleotide exchange assay (schematic on top shows exchange from GDP- to GTP-bound state for RNC, released (control) and refolded HRAS). The plot shows the GDP/GTP exchange activity (mean ± SEM) from three independent refolding reactions (n = 3). We measured the activity as the maximum signal/noise fluorescence ratio obtained relative to buffer (see Methods). Values of ≤ 1 signify no activity. (H) Pulse proteolysis experiments of refolded and native (control) HRAS. The proteolytic stability of HRAS was assayed with thermolysin (see schematic on top). Exemplar western blots are shown and densitometry analyses from three independent refolding repeats (n = 3) are globally fit to an exponential decay with the obtained degradation rate indicated on the plot (mean ± SD from fitted parameters are shown). See Supplementary Fig. 6 for uncropped gel images. 
(I) Pulse proteolysis experiments (with thermolysin) of refolded (R) and native (control, C) HRAS in rabbit reticulocyte lysate (RRL). Exemplar western blots are shown comparing relative refolded/GDP band intensities at 0 and 9 h time points. Densitometry analyses (mean ± SEM) with n = 3 for the 0, 2 and 5 h time points and n = 2 refolding reactions for the 9 h time point are shown in the bottom bar plot. See Supplementary Fig. 7 for uncropped gel images. (J) 1H-15N SOFAST-HMQC NMR spectra of refolded and native (control) HRAS for two independent refolding reactions (left and right, recorded at 298 K and 700 and 800 MHz, respectively). The chemical shift perturbations (CSPs) and signal intensities (mean ± RMSE obtained from spectral noise) of refolded relative to native HRAS are shown below the spectra. The shaded grey areas highlight switch regions 1 and 2, respectively, and the relative signal intensities are also coloured on the HRAS structure (PDB 4Q21).
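The one-state versus two-state comparison in panel F relies on the Bayesian information criterion (BIC), for which the model with the lower value is preferred. A minimal sketch of such a comparison follows. It assumes a least-squares fit with Gaussian errors of unknown variance, in which case the BIC reduces (up to an additive constant) to n ln(RSS/n) + k ln(n); the time-domain analysis used in the paper may evaluate the likelihood differently. The function names and the residuals are hypothetical.

```python
import math


def bic_least_squares(residuals, n_params):
    """BIC (up to a constant) for a least-squares fit with Gaussian errors."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)  # residual sum of squares
    return n * math.log(rss / n) + n_params * math.log(n)


def preferred_model(bic_by_model):
    """Return the name of the model with the lowest BIC."""
    return min(bic_by_model, key=bic_by_model.get)


# Hypothetical residuals from a one-state and a two-state lineshape fit.
one_state = [0.9, -1.1, 1.0, -0.8, 1.2, -1.0]
two_state = [0.8, -1.0, 1.0, -0.8, 1.1, -0.9]
bics = {
    "one state": bic_least_squares(one_state, n_params=3),
    "two states": bic_least_squares(two_state, n_params=6),
}
print(bics, preferred_model(bics))
```

The k ln(n) term penalises the extra parameters of the two-state model, so a second state is supported only when it reduces the residuals by more than that penalty.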
Extended Data Fig. 10. NMR analyses of destabilising FLN5 mutants on and off the ribosome.
All data were recorded at a 1H-Larmor frequency of 500 MHz (19F-Larmor frequency of 470 MHz), 298 K. (A) Mutations mapped on the structure of FLN5. (B) 19F NMR spectra of wild-type and mutant FLN5 RNCs. The spectrum of FLN5 + 34 P742A was previously reported. (C) 19F NMR spectra of wild-type and four mutant FLN5 RNCs in the presence of 2.5 M urea. The spectral noise was used to estimate the maximum population of the native state to calculate a lower bound of its folding free energy in urea. (D) 19F NMR translational diffusion experiment on FLN5 + 67 F672A RNC in 2.5 M urea to monitor the integrity of the sample in urea. The diffusion coefficient does not change significantly throughout the course of the NMR experiment and is consistent with a ribosome-bound species. (E) 19F NMR spectra of the FLN5 + 34 RNC in 1.5 M urea at different temperatures recorded at a 19F-Larmor frequency of 470 MHz. Raw spectra are shown in grey, lineshape fits in colour and the total fit in black. Residuals after fitting are shown below each spectrum. (F) Nonlinear fits to a modified Gibbs-Helmholtz equation for FLN5 + 34 in 1.5 M urea and isolated FLN5Δ6 as a reference. Values are shown as the mean ± SEM propagated from NMR line shape fits. (G) The resulting thermodynamic parameters, including those of FLN5 + 34 without urea (−urea) for reference, are shown as mean ± SD obtained from the fits. (H) 19F NMR spectra of wild-type and mutant FLN5 in isolation. Stabilities were quantified from the unfolded and folded state populations under native conditions; 3.5 M urea was used to quantify the stability of less destabilising variants relative to wild-type (assuming a constant m-value). (I) 1H-15N SOFAST-HMQC spectra of mutant FLN5 variants in isolation (purple) overlaid with wild-type (black). The chemical shift perturbations (CSPs) are mapped onto the crystal structure of FLN5.
The thermodynamic stability and CSPs of isolated FLN5 variants P742A and E6 were previously reported and characterised.
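The free-energy bound in panel C and the urea-based stabilities in panel H rest on standard two-state relations, sketched below. The sketch assumes a two-state equilibrium with ΔG = −RT ln([N]/[U]) and a linear dependence of ΔG on urea concentration with a constant m-value; the function names, the sign convention for m and all numerical values are hypothetical and not taken from the paper.

```python
import math

R = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def folding_free_energy(p_native, T=298.0):
    """Two-state folding free energy (kcal/mol): -RT ln([N]/[U])."""
    if not 0.0 < p_native < 1.0:
        raise ValueError("p_native must lie strictly between 0 and 1")
    return -R * T * math.log(p_native / (1.0 - p_native))


def lower_bound_from_noise(p_native_max, T=298.0):
    """Lower bound on the folding free energy when no native peak is seen.

    p_native_max is the largest native population that could hide in the
    spectral noise, so the true folding free energy is at least this value.
    """
    return folding_free_energy(p_native_max, T)


def extrapolate_to_water(dg_urea, m_value, urea_molar):
    """Linear extrapolation, assuming dG(urea) = dG(water) + m * [urea]."""
    return dg_urea - m_value * urea_molar


# Hypothetical example: at most 5% native state could be hidden in the noise.
print(round(lower_bound_from_noise(0.05), 2))
# Hypothetical m-value of 1.2 kcal mol^-1 M^-1 at 2.5 M urea.
print(extrapolate_to_water(1.0, 1.2, 2.5))
```

A smaller detectable native population (lower noise) gives a more positive bound, which corresponds to a stronger statement about destabilisation.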

References

    1. Cassaignau, A. M. E., Cabrita, L. D. & Christodoulou, J. How does the ribosome fold the proteome? Annu. Rev. Biochem. 89, 389–415 (2020).
    1. Ahn, M. et al. Modulating co-translational protein folding by rational design and ribosome engineering. Nat. Commun. 13, 4243 (2022).
    1. Holtkamp, W. et al. Cotranslational protein folding on the ribosome monitored in real time. Science 350, 1104–1107 (2015).
    1. Plessa, E. et al. Nascent chains can form co-translational folding intermediates that promote post-translational folding outcomes in a disease-causing protein. Nat. Commun. 12, 6447 (2021).
    1. Kaiser, C. M., Goldman, D. H., Chodera, J. D., Tinoco, I. Jr. & Bustamante, C. The ribosome modulates nascent protein folding. Science 334, 1723–1727 (2011).
