J Mol Biol. 2009 Jan 9;385(1):312-29.
doi: 10.1016/j.jmb.2008.10.018. Epub 2008 Oct 15.

Principal component analysis for protein folding dynamics

Gia G Maisuradze et al. J Mol Biol. 2009.

Abstract

Protein folding is considered here by studying the folding dynamics of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending in either the native or a nonnative conformational state, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal component analysis (PCA), an established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of the system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.
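
As an illustration of the PCA step mentioned in the abstract, the following is a minimal sketch that diagonalizes the covariance matrix of a trajectory and projects each frame onto the leading eigenvectors. It is not the authors' implementation: the function name `pca_projections`, the synthetic `toy_traj` data, and the assumption that frames have already been superimposed on a reference structure are all introduced here for illustration only.

```python
import numpy as np

def pca_projections(coords, n_components=3):
    """Project a trajectory onto its leading principal components.

    coords: (n_frames, n_coords) array; frames are assumed to be
    already aligned to a common reference (hypothetical preprocessing).
    """
    centered = coords - coords.mean(axis=0)            # remove the mean structure
    cov = centered.T @ centered / (coords.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                  # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = centered @ eigvecs[:, :n_components]         # PC time series, one column per PC
    return pcs, eigvals, eigvecs

# Synthetic stand-in for a trajectory: one slow collective motion plus noise.
rng = np.random.default_rng(0)
n_frames = 500
slow = np.linspace(-1.0, 1.0, n_frames)
direction = np.ones(6) / np.sqrt(6.0)
toy_traj = 5.0 * np.outer(slow, direction) + 0.1 * rng.standard_normal((n_frames, 6))

pcs, eigvals, eigvecs = pca_projections(toy_traj)
explained = eigvals / eigvals.sum()                    # fraction of variance per PC
```

In this toy case nearly all of the variance falls on the first component, which is the situation the abstract describes when it says PC1 captures the dynamics of the system.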

Figures

Figure 1
Experimental NMR structure of the triple β-strand WW domain from the Formin-binding protein 28 (FBP) (PDB ID: 1E0L).
Figure 2
The first three principal components and rmsd from the native structure of fast- (a), slow- (b), and non-folding (c) MD trajectories at 330 K for 1E0L.
Figure 3
Free energy profiles of the first three principal components (q_i) for fast- (a), slow- (b), and non-folding (c) MD trajectories at 330 K for 1E0L. The numbers 1, 2, 3 within each panel refer to PC1, PC2 and PC3.
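
A free-energy profile along a principal component is commonly estimated from the histogram of that component as F(q) = -k_B T ln P(q). The sketch below assumes this Boltzmann-inversion estimate; the page does not state how the authors computed their profiles, and the function name `free_energy_profile`, the bin count, and the two-well `toy_q` sample are hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_profile(q, temperature, bins=50):
    """Estimate F(q) = -k_B T ln P(q) from samples of one coordinate q."""
    hist, edges = np.histogram(q, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = hist > 0                      # skip empty bins (log of zero)
    f = -K_B * temperature * np.log(hist[populated])
    f -= f.min()                              # shift so the global minimum is zero
    return centers[populated], f

# Synthetic two-state sample: a heavily populated well and a sparse one.
rng = np.random.default_rng(1)
toy_q = np.concatenate([rng.normal(-2.0, 0.5, 8000),
                        rng.normal(2.0, 0.5, 2000)])
centers, profile = free_energy_profile(toy_q, temperature=330.0)
```

The more populated well (near q = -2 in this toy sample) appears as the deeper minimum, in kcal/mol.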
Figure 4
The first principal component (a) and the cosine contents of PC1 (b) of a fast-folding trajectory for 1E0L at 330 K for different time scales, starting from random diffusion (lines 1 and 2) and ending with a full trajectory (line 9).
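
The cosine content of a principal component measures how closely its time series resembles a cosine with i half-periods, which is the shape expected for random diffusion. The sketch below uses the widely used definition attributed to Hess, c_i = (2/T) (∫ cos(iπt/T) p_i(t) dt)² / ∫ p_i(t)² dt, discretized at bin midpoints; that the paper uses exactly this form is an assumption, and the function name `cosine_content` is hypothetical.

```python
import numpy as np

def cosine_content(p, i=1):
    """Cosine content of a PC time series p for i half-periods.

    Values near 1 mean the series looks like cos(i*pi*t/T);
    values near 0 mean it does not.
    """
    p = np.asarray(p, dtype=float)
    n = p.size
    t = (np.arange(n) + 0.5) / n              # midpoint times, scaled to [0, 1]
    overlap = np.sum(np.cos(i * np.pi * t) * p)
    return (2.0 / n) * overlap**2 / np.sum(p**2)

n = 2000
t_mid = (np.arange(n) + 0.5) / n
half_cosine = np.cos(np.pi * t_mid)           # ideal random-diffusion shape
noise = np.random.default_rng(2).standard_normal(n)

c_cos = cosine_content(half_cosine)           # close to 1
c_noise = cosine_content(noise)               # close to 0
```

Evaluating this quantity on growing prefixes of a trajectory gives a time-scale dependence of the kind plotted in panel (b).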
Figure 5
Free energy landscapes (in kcal/mol) for 1E0L with representative structures at the minima of fast- (a), slow- (b), and non-folding (c) MD trajectories at 330 K, and an extremely fast-folding MD trajectory at 335 K (d). A1–A5, B1–B7, C1–C6, D1–D4 are the minima on the free energy landscapes. The structures are colored from blue to red from the N- to the C-terminus.
Figure 6
The mean square displacement of PC1 for the fast-folding MD trajectory for 1E0L at 330 K, i.e., below the folding temperature. The black solid line corresponds to the full trajectory, the red solid and dashed lines correspond to the native and the first half of the unfolded states, respectively; the blue solid and dashed lines correspond to the entire unfolded and transition states, respectively; the black dashed and dash-dot lines correspond to t^0.5 and t^1, respectively.
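
The reference lines t^0.5 and t^1 in this caption correspond to a power-law MSD, MSD(τ) ∝ τ^α, with α < 1 for subdiffusion, α = 1 for normal diffusion, and α = 2 for ballistic motion. The sketch below computes a time-averaged MSD of a one-dimensional series and fits α on a log-log scale. The function names and the synthetic series are hypothetical, and the time-averaged estimator is an assumption rather than the authors' stated procedure.

```python
import numpy as np

def mean_square_displacement(q, max_lag):
    """Time-averaged MSD of a 1-D series q for lags 1..max_lag (in frames)."""
    q = np.asarray(q, dtype=float)
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((q[lag:] - q[:-lag]) ** 2) for lag in lags])
    return lags, msd

def diffusion_exponent(lags, msd):
    """Slope of log(MSD) versus log(lag), i.e. alpha in MSD ~ lag**alpha."""
    slope, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return slope

rng = np.random.default_rng(3)
random_walk = np.cumsum(rng.standard_normal(20000))   # normal diffusion
ballistic = 0.5 * np.arange(20000)                    # constant-velocity motion

lags, msd_rw = mean_square_displacement(random_walk, 100)
_, msd_bal = mean_square_displacement(ballistic, 100)
alpha_rw = diffusion_exponent(lags, msd_rw)           # approximately 1
alpha_bal = diffusion_exponent(lags, msd_bal)         # 2 for ballistic motion
```

Applied to PC1 of a trajectory segment, a fitted exponent between the two reference lines would indicate subdiffusive behavior over the fitted range of lags.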
Figure 7
The rmsd as a function of time for MD trajectories for 1E0L at 335 K (a), at 350 K (b), and at 360 K (c).
Figure 8
The mean square displacement of PC1 of the very fast-folding MD trajectory for 1E0L at 335 K. The red dashed line illustrates the MSD of PC1 calculated for the time interval of first folding [~27.5 ns in Fig. 7(a)]; the red solid line is the MSD of PC1 for the full trajectory [Fig. 7(a)]; the black dashed and dash-dot lines correspond to t^0.5 and t^1, respectively.
Figure 9
(a) The probability distribution function (pdf) of PC1, computed from the fast-folding MD trajectory at 330 K [Fig. 2(a)]. The red solid and dashed lines correspond to the pdf of the full trajectory and the native state, respectively; the blue solid and dashed lines correspond to the pdf of the unfolded and transition state, respectively. (b) The pdf as a function of the dimensionless q (with A = 1) of the analytical cosine function of Brownian motion.
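
For orientation, the pdf of a pure half-cosine can be written in closed form. The derivation below assumes the analytical function in panel (b) has the form q(t) = A cos(πt/T) with t sampled uniformly on [0, T]; the symbol T (trajectory length) and this exact functional form are assumptions, since the caption gives only the amplitude A.

```latex
q(t) = A\cos\!\left(\frac{\pi t}{T}\right), \qquad
\left|\frac{dq}{dt}\right| = \frac{\pi}{T}\,A\,\sin\!\left(\frac{\pi t}{T}\right)
                           = \frac{\pi}{T}\sqrt{A^{2}-q^{2}},
\]
\[
p(q) = \frac{1}{T}\left|\frac{dt}{dq}\right|
     = \frac{1}{\pi\sqrt{A^{2}-q^{2}}}, \qquad |q| < A .
```

Under these assumptions the pdf is U-shaped, with integrable divergences at q = ±A and a minimum of 1/(πA) at q = 0.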
Figure 10
The UNRES model of polypeptide chains. The interaction sites are red side-chain centroids of different sizes (SC) and the peptide-bond centers (p) are indicated by green circles, whereas the α-carbon atoms (small empty circles) are introduced only to assist in defining the geometry. The virtual Cα···Cα bonds have a fixed length of 3.8 Å, corresponding to a trans peptide group; the virtual-bond (θ) and virtual-dihedral (γ) angles are variable. Each side chain is attached to the corresponding α-carbon with a fixed “bond length”, b_{SC_i}, variable “bond angle”, α_i, formed by SC_i and the bisector of the angle defined by C^α_{i−1}, C^α_i, and C^α_{i+1}, and with a variable “dihedral angle”, β_i, of counter-clockwise rotation about the C^α_{i−1}, C^α_i, C^α_{i+1} frame.

