PLoS One. 2013;8(1):e53642.
doi: 10.1371/journal.pone.0053642. Epub 2013 Jan 24.

Polymer uncrossing and knotting in protein folding, and their role in minimal folding pathways

Ali R Mohazab et al. PLoS One. 2013.

Abstract

We introduce a method for calculating the extent to which chain non-crossing is important in the most efficient, optimal trajectories or pathways for a protein to fold. This involves recording all unphysical crossing events of a ghost chain, and calculating the minimal uncrossing cost that would have been required to avoid such events. A depth-first tree search algorithm is applied to find minimal transformations to fold α, β, α/β, and knotted proteins. In all cases, the extra uncrossing/non-crossing distance is a small fraction of the total distance travelled by a ghost chain. Different structural classes may be distinguished by the amount of extra uncrossing distance, and the effectiveness of such discrimination is compared with other order parameters. It was seen that non-crossing distance over chain length provided the best discrimination between structural and kinetic classes. The scaling of non-crossing distance with chain length implies an inevitable crossover to entanglement-dominated folding mechanisms for sufficiently long chains. We further quantify the minimal folding pathways by collecting the sequence of uncrossing moves, which generally involve leg, loop, and elbow-like uncrossing moves, and rendering the collection of these moves over the unfolded ensemble as a multiple-transformation "alignment". The consensus minimal pathway is constructed and shown schematically for representative cases of an α, β, and knotted protein. An overlap parameter is defined between pathways; we find that α proteins have minimal overlap indicating diverse folding pathways, knotted proteins are highly constrained to follow a dominant pathway, and β proteins are somewhere in between. Thus we have shown how topological chain constraints can induce dominant pathway mechanisms in protein folding.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Approximate minimal transformation for a simple conformation pair, and the degree to which link length changes.
(a) Several intermediate conformations of a transformation are shown (A–G, proceeding along the color sequence red, green, yellow, cyan, magenta, gray, and blue), together with the step size delta. Note the step in which the first bead of the chain (formula image) is “snapped” into the final conformation because its distance to the destination is less than formula image (going from D to E). In the intermediate conformation F (gray), beads 0 to 3 have reached their final locations and no longer move. Note also the link length violation of link 4 in conformation F, which arises from the approximation that ignores end point rotations. A milder violation is observed when going from D (cyan) to E (magenta), since beads 1 through formula image all take a step of size formula image while bead 0 moves a step of size formula image. (b) Surface plot of link length as a function of link number and step number during the transformation. Over the whole process, the mean link length formula image is 0.98 units and the standard deviation formula image is 0.063.
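The link length statistics quoted in this caption can be reproduced in outline as follows. This is a minimal sketch, assuming the transformation is stored as an array of bead coordinates with shape (steps, beads, 3); the function name `link_length_stats` and that storage layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def link_length_stats(trajectory):
    """Return per-step link lengths plus their overall mean and standard deviation.

    trajectory: array-like of shape (steps, beads, 3), an assumed layout in
    which consecutive beads along axis 1 are joined by one link.
    """
    traj = np.asarray(trajectory, dtype=float)
    # Vector along each link at each step, then its Euclidean length.
    lengths = np.linalg.norm(np.diff(traj, axis=1), axis=2)  # (steps, links)
    return lengths, lengths.mean(), lengths.std()

# Two snapshots of a 3-bead (2-link) chain; the second stretches link 1.
demo = [
    [[0, 0, 0], [1, 0, 0], [2, 0, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 3, 0]],
]
demo_lengths, demo_mean, demo_std = link_length_stats(demo)
```

The `lengths` array has one row per step and one column per link, which is the quantity a surface plot such as panel (b) would display.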
Figure 2. Link length statistics for randomly generated transformation pairs.
Histogram of the average link length over the course of a transformation, for transformations between 200 randomly generated structures of 9 links and the (randomly generated) reference structure shown in the inset to the figure. The “native” or reference state is shown in the inset, along with several of the 200 initial states. For the ensemble of transformations shown, the ensemble average of the mean link length is 0.96.
Figure 3. Crossing detection using projections.
(a) A 3 link chain with its vertical projection. A crossing in the projection is shown with a green circle. The crossing in the projection occurs at points formula image and formula image, where the chain is parametrized uniformly from 0 to 3. Since link 1 is under link 3 at the point of projection crossing, 0.29 will appear with a negative sign in the corresponding formula image (eqn 1). (b) The blue chain and the red chain have exactly the same vertical projection; however, their corresponding formula image matrices differ in sign, as given in Eq. 2. This indicates that the over-under sense has changed for the links whose projections cross. This in turn indicates that a true crossing has occurred in going from the red conformation to the blue conformation, as opposed to a series of conformations in which the chain has navigated to the opposite crossing sense without passing through itself.
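The projection test described in this caption can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: links are indexed from 0, the chain parameter of a crossing is taken as link index plus fractional position along the link, and the sign convention (+1 when the lower-numbered link passes over) is an assumption. The helper names `projected_crossings` and `sense_flips` are hypothetical.

```python
import numpy as np

def projected_crossings(chain):
    """List crossings in the xy-projection of a 3D polyline.

    Returns tuples (i, j, s_i, s_j, sign): link indices, chain parameters of
    the crossing point on each link, and +1 if link i lies above link j there
    (assumed convention), otherwise -1.
    """
    pts = np.asarray(chain, dtype=float)
    n_links = len(pts) - 1
    found = []
    for i in range(n_links):
        for j in range(i + 2, n_links):  # skip adjacent links sharing a bead
            p, r = pts[i], pts[i + 1] - pts[i]
            q, s = pts[j], pts[j + 1] - pts[j]
            denom = r[0] * s[1] - r[1] * s[0]  # 2D cross product r x s
            if abs(denom) < 1e-12:
                continue  # projections are parallel
            d = q - p
            t = (d[0] * s[1] - d[1] * s[0]) / denom  # position along link i
            u = (d[0] * r[1] - d[1] * r[0]) / denom  # position along link j
            if 0.0 < t < 1.0 and 0.0 < u < 1.0:
                z_i = p[2] + t * r[2]
                z_j = q[2] + u * s[2]
                found.append((i, j, i + t, j + u, 1 if z_i > z_j else -1))
    return found

def sense_flips(chain_a, chain_b):
    """Link pairs whose over-under sense differs between two conformations."""
    a = {(i, j): sg for i, j, _, _, sg in projected_crossings(chain_a)}
    b = {(i, j): sg for i, j, _, _, sg in projected_crossings(chain_b)}
    return {pair for pair in a.keys() & b.keys() if a[pair] != b[pair]}

# Two chains with the same projection but opposite crossing sense.
red = [(0, 0, 0), (2, 0, 0), (1, 1, 1), (1, -1, 1)]
blue = [(0, 0, 0), (2, 0, 0), (1, 1, -1), (1, -1, -1)]
```

Here `sense_flips(red, blue)` reports the link pair whose sense changed, which is the signature of a true crossing in panel (b).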
Figure 4. Two possible minimal uncrossing transformations.
The top transformation involves twisting of the loop. The lower transformation involves a snake-like movement of the vertical leg. A third would involve moving the horizontal leg in a similar snake-like fashion. Note that the moves represented here are not necessarily the most efficient ones in their topological class, but rather the most intuitive ones. There are transformations that are topologically equivalent but generally involve less total motion of the chain (see for example Figures 11(a), 11(b)).
Figure 5. Accounting for history-dependence in minimal uncrossing transformations.
The minimal untangling movement in going from A to C (through B′) is less than the sum of the minimum untangling movements going from A to B and then from B to C.
Figure 6. Snapshots of a transformation with two crossings.
A few snapshots during a transformation involving 2 instances of chain crossing. The transformation occurs clockwise starting from initial configuration I and proceeding to final configuration F.
Figure 7. Identification of leg-uncrossing.
For the crossing points indicated by the green circles, two legs, colored blue and red, can be identified. Each leg starts at the crossing and terminates at an end.
Figure 8. Crossing substructures.
(a) A single leg structure, (b) A loop structure, (c) An elbow structure.
Figure 9. Schematic illustration of the canonical leg movement.
Schematic illustration of the canonical leg movement, either from left to right as in (a) or effectively its time reverse as in (b). Both transformations traverse the same distance. The transformation in (a) is equivalent to the “plug” transformation analyzed in the context of folding simulations for trefoil knotted proteins, while the transformation in (b) (see ref. for a detailed description of this transformation) is equivalent to the “slipknotting” transformation more often observed in the folding of knotted proteins.
Figure 10. A single leg movement can undo several crossings.
One can reverse the over-under nature of all the crossings that have occurred on a leg, through a single leg movement.
Figure 11. Relation of minimal loop uncrossing to Reidemeister type I moves.
(a) Reversing the over-under nature of a crossing through a topological loop twist: Reidemeister move type I. (b) By “pinching” the loop before the twist, the cost in distance for changing the crossing nature is reduced.
Figure 12. Schematic of the canonical elbow move.
Schematic of the canonical elbow move, proceeding from left to right.
Figure 13. A simple example depicting various crossing substructures.
A chain with several self-crossing points before and after untangling. Various topological substructures that are discussed in the text are color coded. For the case of the legs (red and cyan) note that various other legs can be identified, for example a leg that starts at crossing 2 and ends at the red terminus. Here we color only the shortest legs from crossing 1 to the terminus as red, and crossing 2 to the opposite terminus as cyan.
Figure 14. Illustration of the depth-first tree search algorithm for the given crossing structure shown.
An example (subset) tree of possible transformations for a given crossing structure. Accumulated distances are given inside the circles representing nodes of the tree; the non-crossing transformations and their corresponding distances are shown next to the branches of the tree. The algorithm starts from the bottom node and proceeds to the top nodes, starting in this case along the right-most branch. The possible transformations to be considered as candidate minimal transformations are: formula image, formula image, formula image which then terminates because the accumulated distance exceeds the minimum so far of 25, and formula image.
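The search with early termination described in this caption is a depth-first search with pruning against the best total found so far. The sketch below is a generic version under stated assumptions: the tree encoding (nested lists of `(move, distance, children)` tuples), the move labels, and all distances are hypothetical and are not the values shown in the figure.

```python
import math

def dfs_min_cost(branches, acc=0.0, path=(), best=(math.inf, ())):
    """Depth-first search for the cheapest sequence of uncrossing moves.

    branches: list of (move, distance, children) tuples; an empty list marks
    a node where all crossings have been undone. Returns (cost, moves).
    """
    if acc >= best[0]:
        return best  # prune: this branch cannot beat the minimum so far
    if not branches:
        return (acc, path)  # leaf reached with a new minimum
    for move, distance, children in branches:
        best = dfs_min_cost(children, acc + distance, path + (move,), best)
    return best

# Hypothetical tree: three candidate first moves, each followed by one more.
tree = [
    ("leg_N", 10, [("loop", 15, [])]),
    ("elbow", 20, [("leg_C", 12, [])]),
    ("loop", 30, [("leg_N", 1, [])]),
]
best_cost, best_moves = dfs_min_cost(tree)
```

In this toy tree the first branch sets a minimum of 25; the second is abandoned once its accumulated distance reaches 32, and the third is abandoned at 30 without visiting its child.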
Figure 15. Clustering of protein classes depending on order parameter.
(A) Scatter plot of all proteins as a function of formula image and formula image. Knotted proteins are indicated as green circles and are clustered; unknotted proteins are grouped by the black closed curve, and contain α-helical proteins clustered in red and mixed α-β proteins clustered in magenta. Beta proteins are indicated in blue. Two- and three-state proteins are indicated as triangles and squares, respectively. LRO provides a strong discriminant against formula image and mixed proteins, but not knotted and unknotted proteins, while formula image discriminates knotted from unknotted proteins, and moderately discriminates formula image proteins from mixed proteins. (B) Scatter plot of all proteins as a function of formula image and formula image. The rendering scheme for protein classes is the same as in panel (A). Kinetic 2-state folders are indicated by the black dashed curve. Both formula image and formula image distinguish knotted from unknotted proteins, and 2-state from 3-state proteins. By projecting formula image proteins and either mixed α/β or all-formula image proteins onto each order parameter, one can see how formula image can discriminate formula image proteins from both mixed or formula image proteins, while formula image cannot. This is despite the significant correlation between formula image and formula image.
Figure 16. Statistical significance for all order parameters in distinguishing between different classes of proteins.
The -log of the statistical significance is plotted as a function of pairs of protein classes, so that a higher number indicates better ability to distinguish between different classes. The blue horizontal line indicates a threshold of formula image for statistical significance.
Figure 17. Approximate scaling laws for MRSD and non-crossing distance per residue formula image, across proteins and for domains within a single protein.
(A) MRSD (blue circles) as a function of chain length formula image for our protein dataset. The slope of the best fit line on the log-log plot gives the power law scaling: formula image. Non-crossing distance per residue formula image (red circles) vs. formula image shows much larger scatter across native topologies, but follows an approximate scaling law formula image which is superextensive, indicating an increasing importance of chain non-crossing per residue as system size is increased. At system sizes larger than formula image, even minimal motion is dominated by entanglement. (B) Same quantities as in panel (A), but for the domains in proteins 2A5E and 2HA8. The scaling laws differ from those in panel (A), and show stronger chain-length dependence. For 2HA8, domains 1, 2 and 1-2 together (the full protein) are considered; for 2A5E, domains 1, 2, 3, 4, 1-2, 2-3, 3-4, 1-2-3, 2-3-4, and 1-2-3-4 (the full protein) are considered. Based on these scaling laws found by building up proteins from subdomains, at system sizes larger than formula image, minimal pathways become entanglement-dominated. (C) Schematic renderings of the domains, color-coded in 2HA8 (left) and 2A5E (right).
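Reading a power law off the slope of a log-log best-fit line, and locating the chain length at which two such laws cross, can be sketched as follows. This is a minimal illustration on synthetic numbers; the exponents, prefactors, and function names are assumptions, and none of the values are the fitted results of the paper.

```python
import numpy as np

def power_law_fit(n, y):
    """Fit y = a * n**b by least squares on log-log axes; return (b, a)."""
    slope, intercept = np.polyfit(np.log(np.asarray(n, dtype=float)),
                                  np.log(np.asarray(y, dtype=float)), 1)
    return slope, np.exp(intercept)

def crossover_size(a1, b1, a2, b2):
    """Chain length N* at which a1 * N**b1 equals a2 * N**b2 (b1 != b2)."""
    return (a1 / a2) ** (1.0 / (b2 - b1))

# Synthetic data that follows y = 2 * n**1.5 exactly.
sizes = [50, 100, 200, 400]
values = [2.0 * v ** 1.5 for v in sizes]
exponent, prefactor = power_law_fit(sizes, values)
```

For example, `crossover_size(4.0, 0.5, 1.0, 1.5)` gives 4.0, the size beyond which the steeper law dominates. With real, scattered data the fitted exponent carries an uncertainty that this sketch does not estimate.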
Figure 18. Schematic renderings of the three proteins whose minimal transformations we investigate in detail.
(A) Acyl-coenzyme A binding protein, PDB id 2ABD, an all-α protein; (B) Src homology 3 (SH3) domain of phosphatidylinositol 3-kinase, PDB id 1PKS, a largely β protein; (C) the designed knotted protein 2ouf-knot, PDB id 3MLG.
Figure 19. Bar plots for the uncrossing operations involved in minimal transformations from an unfolded ensemble, for the protein 2ABD.
The sequence of non-crossing operations in the transformation corresponding to a given pair of conformations is represented as a color-coded series of bars, with the sequence of moves going from right to left, and the length of each bar indicating the non-crossing distance undertaken by a particular move. Red bars indicate N-terminal leg (formula image) uncrossing, green bars indicate C-terminal leg (formula image) uncrossing, blue bars indicate Reidemeister “pinch and twist” loop uncrossing moves, and cyan bars indicate elbow uncrossing moves. The same set of 172 transformations is shown in panels A and B. Panel A sorts uncrossing transformations by rank ordering the following move types, largest to smallest: formula image, formula image, loop uncrossing, elbow move. Panel B sorts moves by formula image, formula image, loop uncrossing, elbow move. The scale bar underneath each panel indicates a distance of 100 in units of the link length. The arrow in each panel denotes the “most representative” transformation, as defined in the text.
Figure 20. Bar plots of the uncrossing operations involved in minimal transformations for the β-sheet protein 1PKS.
See Figure 19 and the text for more details. Red bars: formula image uncrossing moves; green bars: formula image uncrossing moves; Blue bars: loop uncrossing moves; Cyan bars: elbow uncrossing moves. The same set of 195 transformations is shown in panels A and B, sorted as in Figure 19. The scale bar underneath each panel indicates a distance of 100 in units of the link length.
Figure 21. Bar plots of the uncrossing operations involved in the minimal transformations for the knotted protein 3MLG.
See Figure 19 and the text for more details. Red bars: formula image uncrossing moves; green bars: formula image uncrossing moves; Blue bars: loop uncrossing moves; Cyan bars: elbow uncrossing moves. The same set of 90 transformations is shown in panels A and B, sorted as in Figure 19. The scale bar underneath each panel indicates a distance of 100 in units of the link length. The arrow in each panel denotes the “most representative” transformation, as defined in the text. The transformation located 8 bars up from the bottom of Panel A requires both formula image and formula image moves; however, both leg motions are very small.
Figure 22. Consensus histograms of the transformations described in Figures 19, 20, 21.
See text for a description of the construction. Each bar represents the distance of a corresponding move type, N or C leg (formula image or formula image), elbow E, or loop R. The order of the sequence of moves is taken from right to left along the x-axis. An all-α protein (2ABD), an all-β protein (1PKS), and a knotted protein (3MLG) are considered. (A) Transformations with leg formula image as the largest move. These encompass 15% of the transformations in the α protein, 16% of the transformations in the β protein, and 73% of the transformations for the knotted protein. (B) Transformations with leg formula image as the largest move, which encompass 13% of the α protein transformations, 54% of β protein transformations, and 18% of knotted protein transformations. (C) Transformations with either an elbow E or loop R as the largest move, which encompass 71% of the α protein transformations, 29% of β protein transformations, and 9% of knotted protein transformations.
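Grouping transformations by their largest move, as in panels (A) to (C), amounts to a simple tally. The sketch below assumes each transformation is stored as a list of `(move_type, distance)` pairs; that layout, the leg labels `LN` and `LC`, and the toy distances are illustrative assumptions rather than data from the paper.

```python
from collections import Counter

def largest_move_shares(transformations):
    """Percentage of transformations whose largest move is of each type.

    transformations: list of lists of (move_type, distance) pairs (assumed
    layout). Transformations with no uncrossing moves are ignored.
    """
    counts = Counter(
        max(moves, key=lambda m: m[1])[0]
        for moves in transformations
        if moves
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {move: 100.0 * n / total for move, n in counts.items()}

# Toy ensemble of four transformations with made-up distances.
demo_ensemble = [
    [("LN", 40), ("R", 5)],
    [("E", 12), ("LC", 3)],
    [("LN", 30)],
    [("R", 9), ("E", 2)],
]
shares = largest_move_shares(demo_ensemble)
```

Summing the `E` and `R` entries of the result gives the combined elbow-or-loop share that panel (C) reports.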
Figure 23. Schematic of the most representative transformation for the protein 2ABD.
Figure 24. Schematic of the most representative transformation for the protein 1PKS.
Figure 25. Schematic of the most representative transformation for the knotted protein 3MLG.
Figure 26. Overlap between minimal transformations.
Schematic diagram for the residues involved in uncrossing operations for two minimal transformations labelled by formula image and formula image, to illustrate the sequence overlap between transformations.
Figure 27. Distribution of pathway overlap between minimal transformations, for an α, β, and knotted protein.
Pathway overlap (formula image) distributions for the 3 proteins in Figure 18, as defined by Equation (8), operating on the transformations in Figures 19, 20, 21. (a) The pathway overlap distribution for the all-α protein 2ABD shows a large contribution for formula image (the peak height in the distribution is formula image), indicating that a diverse set of minimal transformations folds the protein. The average formula image for these transformations is formula image. (b) The pathway overlap distribution for the β protein shows the emergence of a peak around formula image, indicating partial restriction of folding pathways. The peak near formula image still carries more weight in the distribution. The average formula image. (c) The peak around formula image becomes dominant for the pathway overlap distribution of the knotted protein, indicating the emergence of a dominant restricted minimal folding pathway. The average formula image.
