Bioinformatics. 2010 Mar 1;26(5):625-31.
doi: 10.1093/bioinformatics/btq012. Epub 2010 Jan 16.

Protein secondary structure appears to be robust under in silico evolution while protein disorder appears not to be

Christian Schaefer et al. Bioinformatics.

Abstract

Motivation: The mutation of amino acids often impacts protein function and structure. Mutations without negative effect withstand evolutionary pressure. We study a particular aspect of structural robustness with respect to mutations: regular protein secondary structure and natively unstructured (intrinsically disordered) regions. Is the formation of regular secondary structure an intrinsic feature of amino acid sequences, or is it a feature that is lost upon mutation and is maintained by evolution against the odds? Similarly, is disorder an intrinsic sequence feature or is it difficult to maintain? To tackle these questions, we mutated native protein sequences in silico into random sequence-like ensembles and monitored the change in predicted secondary structure and disorder.
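The mutation protocol described above can be illustrated with a minimal sketch. It assumes uniform random substitutions, which is a simplification: the study draws substitutions according to PAM120 or a background distribution. The function names (`percent_identity`, `mutate_step`, `trajectory`) and the fixed number of mutations per step are hypothetical choices made for illustration only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def percent_identity(a, b):
    """Pairwise percentage sequence identity (PIDE) of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mutate_step(seq, n_mutations, rng):
    """Substitute n_mutations distinct, randomly chosen positions.

    Assumption: the new residue is drawn uniformly from the other 19 amino
    acids. This stands in for a PAM120 or background-distribution draw.
    """
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mutations):
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def trajectory(native, steps, per_step, seed=0):
    """Return [native, mutant_1, ..., mutant_steps]; each mutant builds on the previous one."""
    rng = random.Random(seed)
    seqs = [native]
    for _ in range(steps):
        seqs.append(mutate_step(seqs[-1], per_step, rng))
    return seqs
```

Calling `trajectory(native, steps=69, per_step=k)` would give one single trajectory. An ensemble, as used for the consensus results, could be approximated by repeating the call with different seeds.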

Results: We established that by our coarse-grained measures for change, predictions and observations were similar, suggesting that our results were not biased by prediction mistakes. Changes in secondary structure and disorder predictions were linearly proportional to the change in sequence. Surprisingly, neither the content nor the length distribution for the predicted secondary structure changed substantially. Regions with long disorder behaved differently in that significantly fewer such regions were predicted after a few mutation steps. Our findings suggest that the formation of regular secondary structure is an intrinsic feature of random amino acid sequences, while the formation of long disordered regions is not an intrinsic feature of proteins with disordered regions. Put differently, helices and strands appear to be maintained easily by evolution, whereas maintaining disordered regions appears difficult. Neutral mutations with respect to disorder are therefore very unlikely.

Figures

Fig. 1.
Secondary structure changes proportional to sequence. (A and C) For decreasing pairwise percentage sequence identity (x-axis, PIDE), we monitored the similarity between secondary structure predictions (Q3, i.e. percentage of residues identical in one of the three states helix, strand and other) for native and for mutant (yellow: mutations according to PAM120; green: according to background distribution; Section 2). (A and B) show results for a single trajectory, (C and D) the consensus over an ensemble of five trajectories (Section 2). Box plots reflect the range of the distribution (Section 2); median values are marked by horizontal bars and mean values are connected by dotted lines. For instance, at ∼90% pairwise sequence identity, ∼88% of the residues are predicted in the same secondary structure as the native; for the ensemble, this value is slightly higher (leftmost bars in A and C). The curves converge nearly linearly towards values ∼35% corresponding to random. (B and D) For one particular example (PDB identifier 1a2s chain A), we display the actual secondary structure predictions for each mutant: native on top; each row marks one of the 69 mutation steps (Section 2); mutation by PAM120. The top (B) is for one single mutation trajectory, the bottom (D) for an ensemble of five trajectories. One observation stands out and is representative of all such plots that we looked at: blocks of regular secondary structure appear to be more robust under mutation than the actual type of secondary structure, i.e. helices flip to strands and vice versa, and this happens more often than the transitions helix→other and strand→other. Borders are much more ‘fluid’ for the ensemble (D) than for a single mutation trajectory (B).
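The Q3 similarity used in this figure can be sketched as follows, assuming that each prediction is given as a string of per-residue three-state labels. The label alphabet (`H` for helix, `E` for strand, `L` for other) and the function name `q3` are assumptions made for illustration; the caption only specifies the three states.

```python
def q3(states_a, states_b):
    """Percentage of residues assigned the same state (helix, strand or other)
    in two equal-length three-state strings.

    Assumption: states are encoded as 'H' (helix), 'E' (strand), 'L' (other).
    """
    if len(states_a) != len(states_b):
        raise ValueError("state strings must have equal length")
    if not states_a:
        raise ValueError("state strings must not be empty")
    same = sum(a == b for a, b in zip(states_a, states_b))
    return 100.0 * same / len(states_a)
```

For example, `q3("HHHEEELLL", "HHHEEEHHH")` compares a native prediction with a mutant prediction in which only the last three residues changed state, so six of nine residues agree.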
Fig. 2.
Content and length of regular secondary structure unchanged. Box plots and coloring as in Figure 1. Change of regular secondary structure on mutation given by the composition of predicted helix (A) and strand (B), as well as the average lengths of predicted helices (C) and strands (D). The second and third bars on the left in (A) and (B) compare predictions (light gray) with observations (taken from DSSP, dark gray) for the PDB dataset; the first bar on the left in (A) and (B) indicates the degree to which the predictions differ for the PDB dataset (dark gray) and for a set of all human proteins (light blue). The right-most green bars mark the predictions for randomly assembled sequences (Section 2, labeled as ‘Comp’). Overall, neither the length nor the content of regular secondary structure appears to differ between native and random.
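The two quantities plotted here, state content and average segment length, can be computed from a three-state string roughly as sketched below. The encoding (`H`, `E`, `L`) and the helper names are assumptions for illustration, not the authors' implementation.

```python
from itertools import groupby


def content(states, label):
    """Percentage of residues in the given state (e.g. 'H' or 'E')."""
    if not states:
        raise ValueError("state string must not be empty")
    return 100.0 * states.count(label) / len(states)


def segment_lengths(states, label):
    """Lengths of all maximal runs of consecutive residues in the given state."""
    return [len(list(run)) for key, run in groupby(states) if key == label]


def mean_segment_length(states, label):
    """Average run length for the given state; 0.0 if the state never occurs."""
    lengths = segment_lengths(states, label)
    return sum(lengths) / len(lengths) if lengths else 0.0
```

Applying these functions to the native and to each mutant along a trajectory would give the per-step values that the box plots summarize.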
Fig. 3.
Predicted long disorder changes rapidly. Panels on the left show results for long regions of disorder (30 or more consecutive residues), those on the right for short regions (fewer than eight). The top panels (A and B) demonstrate how much the predictions of disorder changed over the course of mutations (y-axis: residues predicted identical as disorder between native and mutant as percentage of disorder predicted in native). Disorder predictions diverge from native much more rapidly than do secondary structure predictions, and much more for long (A) than for short (B) disorder. The relative content of residues in predicted long (C) and short (D) disordered regions diverges differentially. The first two box plots for (C) depict the observed (dark gray) and predicted (light gray) disordered content in native sequences. The right-most box plots in both (C) and (D) show disorder in the artificially created sequences (Section 2, labeled as ‘Comp’). For a representative example (DisProt identifier: DP00006), the IUPred predictions for long (E) and short (F) disorder are shown for each mutant: native on top; each row marks 1 of the 69 PAM120 mutation steps (Section 2). Red lines mark predictions that fall into the threshold category (30 or more/fewer than eight). Long disordered regions disappear (E), whereas short disorder persists at both termini and repeatedly appears and disappears in the middle region during mutation (F).
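The length thresholds and the y-axis measure of panels (A) and (B) can be sketched as below. The sketch assumes per-residue disorder scores in [0, 1] and a cutoff of 0.5 for calling a residue disordered; both the cutoff and the helper names are assumptions for illustration and do not reproduce how IUPred itself is run.

```python
def disorder_mask(scores, threshold=0.5):
    """True where a residue is called disordered (assumed cutoff: score >= 0.5)."""
    return [s >= threshold for s in scores]


def runs(mask):
    """Half-open (start, end) intervals of consecutive True values."""
    found, start = [], None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            found.append((start, i))
            start = None
    if start is not None:
        found.append((start, len(mask)))
    return found


def region_mask(mask, min_len=1, max_len=None):
    """Keep only residues in runs with min_len <= length <= max_len."""
    keep = [False] * len(mask)
    for start, end in runs(mask):
        length = end - start
        if length >= min_len and (max_len is None or length <= max_len):
            keep[start:end] = [True] * length
    return keep


def retained_percent(native_mask, mutant_mask):
    """Residues disordered in both native and mutant, as a percentage of
    the residues disordered in the native."""
    native_total = sum(native_mask)
    if native_total == 0:
        return 0.0
    shared = sum(n and m for n, m in zip(native_mask, mutant_mask))
    return 100.0 * shared / native_total
```

With these helpers, long disorder corresponds to `region_mask(mask, min_len=30)` and short disorder to `region_mask(mask, max_len=7)`, following the thresholds stated in the caption.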
Fig. 4.
Examples of proteins with mutation trajectories. For each of the four main SCOP classes (Murzin et al., 1995), we randomly picked one representative short enough to fit into the space here. Ribbon plots were generated by Chimera (Pettersen et al., 2004) [red: helix, green: strand, according to DSSP (Kabsch and Sander, 1983)]. (A–D) In each of the four panels, the ribbon diagram for the native is on the left, and on the right are the 69 steps of the mutation trajectory (top: native, degree of mutation increases downwards; mutations according to PAM120, Section 2). The sequence runs from the most N-terminal residues (labeled ‘1’) to the most C-terminal ones. Note that although we show only single trajectories, rather than ensemble averages here, almost no helix or strand withstands the mutation protocol to the end.
