J Chem Phys. 2011 Feb 21;134(7):075103.
doi: 10.1063/1.3519056.

Constrained proper sampling of conformations of transition state ensemble of protein folding

Ming Lin et al. J Chem Phys. 2011.

Abstract

Characterizing protein conformations in the transition state ensemble (TSE) is important for studying protein folding. A promising approach to studying the TSE, pioneered by Vendruscolo et al. [Nature (London) 409, 641 (2001)], is to generate conformations that satisfy all constraints imposed by the experimentally measured φ values, which provide information about the native likeness of the transition states. Faísca et al. [J. Chem. Phys. 129, 095108 (2008)] generated TSE conformations through Markov chain Monte Carlo (MCMC) simulations, based on the criterion that, starting from a TS conformation, the probabilities of folding and unfolding are about equal. In this study, we use the constrained sequential Monte Carlo method [Lin et al., J. Chem. Phys. 129, 094101 (2008); Zhang et al., Proteins 66, 61 (2007)] to generate TSE conformations of acylphosphatase (98 residues) that satisfy the φ-value constraints as well as the criterion that each conformation has a folding probability of 0.5 in Monte Carlo simulations. We adopt a two-stage process: we first generate 5000 contact maps satisfying the φ-value constraints, and each contact map is then used to generate 1000 properly weighted conformations. After clustering similar conformations, we obtain a set of properly weighted samples of 4185 candidate clusters. A representative conformation of each cluster is then selected, and 50 runs of MCMC simulation are carried out using a regrowth move set. We then select a subset of 1501 conformations that have equal probabilities of folding and unfolding as the TSE. These 1501 samples characterize well the distribution of TSE conformations of acylphosphatase. Compared with previous studies, our approach can access a much wider conformational space and can objectively generate conformations that satisfy the φ-value constraints and the criterion of 0.5 folding probability without bias.
In contrast to previous studies, our results show that transition state conformations are very diverse and are far from native-like when measured by Cartesian root-mean-square deviation (cRMSD): the average cRMSD between TSE conformations and the native structure is 9.4 Å for this short protein, instead of the 6 Å reported in previous studies. In addition, we found that the average fraction of native contacts in the TSE is 0.37, with enrichment in native-like β-sheets and a shortage of long-range contacts, suggesting that such contacts form at a later stage of folding. We further calculate the first passage time of folding of TSE conformations by computing the physical time associated with the regrowth moves in the MCMC simulation: these moves are mapped to a Markovian state model whose transition times were obtained from Langevin dynamics simulations. Our results indicate that, despite its large structural diversity, the TSE is characterized by similar folding times. Our approach is general and can be used to study the TSE in other macromolecules.
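
The selection step described in the abstract (keeping conformations whose folding probability is about 0.5) can be illustrated with a minimal sketch. It assumes that the 50 MCMC runs are performed per representative conformation, as Figure 2(b) suggests, and that a conformation is kept when its count of runs reaching the folded state falls inside a band around 25. The function names and the band limits `low=20` and `high=30` are hypothetical; the abstract does not state the thresholds actually used.

```python
from typing import List, Sequence


def estimate_pfold(folded_runs: int, total_runs: int = 50) -> float:
    """Estimate the folding probability as the fraction of runs that folded."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    if not 0 <= folded_runs <= total_runs:
        raise ValueError("folded_runs must lie in [0, total_runs]")
    return folded_runs / total_runs


def select_tse_candidates(
    folded_counts: Sequence[int],
    total_runs: int = 50,
    low: int = 20,   # hypothetical lower limit on the folded-run count
    high: int = 30,  # hypothetical upper limit on the folded-run count
) -> List[int]:
    """Return indices of conformations whose folded-run count is in [low, high]."""
    selected = []
    for index, count in enumerate(folded_counts):
        if not 0 <= count <= total_runs:
            raise ValueError(f"count {count} at index {index} is out of range")
        # Integer comparison avoids floating-point edge effects at the limits.
        if low <= count <= high:
            selected.append(index)
    return selected


if __name__ == "__main__":
    counts = [0, 20, 25, 30, 31, 50]  # made-up folded-run counts, one per conformation
    for i in select_tse_candidates(counts):
        print(i, estimate_pfold(counts[i]))
```

Comparing integer counts rather than the estimated probabilities mirrors the two horizontal lines on run counts in Figure 2(b); with only 50 runs per conformation, the estimate of the folding probability is coarse, so some tolerance around 0.5 is needed in any case.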


Figures

Figure 1
Reproducing ϕ values of acylphosphatase (AcP). Experimentally measured ϕ values (Ref. 4) and calculated ϕ values obtained from conformation samples properly weighted with respect to the uniform distribution in Ωϕ are shown.
Figure 2
Defining folded and unfolded states and selecting conformations with 0.5 probability of folding. (a) The thresholds (vertical dashed lines) of the fraction of native contacts preserved for the folded and unfolded states. (b) Number counts of Markov chain Monte Carlo runs that reach the folded state for the set of 4185 conformations. Each point represents one conformation. Only the conformations between the two horizontal lines are included in TSE.
Figure 3
The distributions of cRMSD values of conformations satisfying the ϕ-value constraints and the distributions of the fraction of native contacts preserved. The distributions of cRMSD to the native conformation of acylphosphatase, at different cRMSD intervals, for (a) the transition state ensemble ΩTSE of 1,501 clusters of conformations, (c) the denatured-side ensemble ΩDS, and (e) the native-side ensemble ΩNS; and the distributions of the fraction of native contacts preserved, at different intervals, for (b) ΩTSE, (d) ΩDS, and (f) ΩNS. Both unweighted (white bars) and weighted (black bars) distributions are shown.
Figure 4
Lack of correlation between energy and cRMSD of conformations in the TSE.
Figure 5
The distributions of cRMSD between the secondary structures in the weighted TS conformations and in the native state of acylphosphatase. (a) Helix α1 (residues 22–33, white bars) and helix α2 (residues 55–66, black bars); (b) strand β3 (residues 46–53, white bars) and strand β4 (residues 77–85, black bars).
Figure 6
The recovery of overall ϕ-values and resulting larger cRMSD values of conformations generated with ϕ-values constrained only at three key residues of Y11, P54, and F94. (a) Experimentally measured ϕ-values and calculated ϕ-values obtained from conformation samples satisfying the ϕ-value constraints of three key residues only. (b) The distributions of cRMSD values of conformations satisfying the ϕ-value constraints of three key residues (white bar) and 24 residues (black bar).
Figure 7
The average point-wise distances of residues between the weighted TSE and the native conformation of protein acylphosphatase. The three circles are the three key residues identified by Vendruscolo et al. (Ref. 40) that have large experimentally measured ϕ values. They have overall small point-wise cRMSD values.
Figure 8
The fractions of preserved native contacts at different sequence separations in acylphosphatase for (a) the weighted TSE and (b) the native conformation. Bins 1–11 correspond to sequence separations of 4, 5, 6–10, 11–20, 21–30, 31–40, 41–50, 51–60, 61–70, 71–80, and 81–90, respectively.
Figure 9
Estimating the first passage time (FPT) to folded structure and correlation between FPT and fraction of native contacts among TSE. (a) An illustration of counting the first passage time ξ˜ji as t1t(j). (b) The average FPT of conformations in TSE of AcP. Each point represents a transition state conformation.
Figure 10
The agreement of counted and calculated traveling times between states. For the fixed number of residues L = 11 and the end-to-end distance r = 5 Å, this figure shows: (a) The frequency of different states in the trajectory of the simulation; and the ratio of the counted traveling time to the calculated traveling time for fixed destination state (b) 5, (c) 40, and (d) 75. Except for rarely observed states, counted and calculated traveling times agree well with each other.
Figure 11
First passage time to folded structures and cRMSD distance to the native structure. (a) The average first passage time of transition state conformations at different cRMSD distances to the native structure. For comparison, (b), (c), and (d) plot the distributions of the first passage time of the conformations in ΩTSE, ΩDS, and ΩNS, respectively. Both unweighted (white bars) and weighted (black bars) distributions are shown.
Figure 12
Effects of different orderings of conformations on clustering. (a) The distributions of cRMSD values of representative conformations of clusters obtained when conformations are ordered by weight (white bars) and when they are randomly ordered (black bars). (b) The distributions of the fraction of native contacts for clusters obtained using conformations ordered by weight (white bars) and using randomly ordered conformations (black bars). Overall, these distributions are very similar.
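
The sequence-separation binning used in the Figure 8 caption can be sketched as follows. The bin edges are taken from that caption; everything else is an assumption made for illustration: contacts are represented as pairs of residue indices, a native contact counts as preserved when the same pair appears in the conformation's contact set, and the function names are hypothetical. The page does not specify how contacts are defined.

```python
from bisect import bisect_left
from typing import Dict, Iterable, Optional, Tuple

Contact = Tuple[int, int]

# Upper edges of bins 1..11 from the Figure 8 caption:
# separations 4, 5, 6-10, 11-20, 21-30, ..., 81-90.
BIN_UPPER_EDGES = [4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90]


def separation_bin(i: int, j: int) -> Optional[int]:
    """Return the 1-based bin for residues i and j, or None if outside 4..90."""
    separation = abs(i - j)
    if separation < 4 or separation > 90:
        return None
    return bisect_left(BIN_UPPER_EDGES, separation) + 1


def preserved_fraction_by_bin(
    native_contacts: Iterable[Contact],
    conformation_contacts: Iterable[Contact],
) -> Dict[int, float]:
    """Fraction of native contacts preserved, per sequence-separation bin."""
    # Normalise pairs so that (i, j) and (j, i) are treated as the same contact.
    native = {tuple(sorted(c)) for c in native_contacts}
    observed = {tuple(sorted(c)) for c in conformation_contacts}
    totals: Dict[int, int] = {}
    preserved: Dict[int, int] = {}
    for i, j in native:
        bin_index = separation_bin(i, j)
        if bin_index is None:
            continue  # separation outside the range covered by the bins
        totals[bin_index] = totals.get(bin_index, 0) + 1
        if (i, j) in observed:
            preserved[bin_index] = preserved.get(bin_index, 0) + 1
    return {b: preserved.get(b, 0) / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    native = {(1, 5), (1, 6), (2, 10), (3, 20)}  # made-up native contacts
    sample = {(1, 5), (2, 10), (7, 9)}           # made-up conformation contacts
    print(preserved_fraction_by_bin(native, sample))
```

Bins with no native contacts are omitted from the result rather than reported as zero, so that an empty bin is not confused with a bin whose contacts were all lost. For a weighted ensemble, the per-conformation fractions would additionally be averaged with the sample weights, which this sketch does not do.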

References

    1. Dill K. A., Ozkan S. B., Shell M. S., and Weikl T. R., Annu. Rev. Biophys. 37, 289 (2008). doi:10.1146/annurev.biophys.37.092707.153558
    2. Pande V. S., Grosberg A. Yu., Tanaka T., and Rokhsar D. S., Curr. Opin. Struct. Biol. 9, 68 (1998). doi:10.1016/S0959-440X(98)80012-2
    3. Shakhnovich E., Chem. Rev. 106, 1559 (2006). doi:10.1021/cr040425u
    4. Bartolini M. and Andrisano V., ChemBioChem 11, 1018 (2010). doi:10.1002/cbic.200900666
    5. Calosci N., Chi C. N., Richter B., Camilloni C., Engstrom A., Eklund L., Travaglini-Allocatelli C., Gianni S., Vendruscolo M., and Jemth P., Proc. Natl. Acad. Sci. U.S.A. 105, 19241 (2008). doi:10.1073/pnas.0804774105
