Bioinformatics. 2010 Mar 1;26(5):625-31.

doi: 10.1093/bioinformatics/btq012. Epub 2010 Jan 16.

Protein secondary structure appears to be robust under in silico evolution while protein disorder appears not to be

Christian Schaefer¹, Avner Schlessinger, Burkhard Rost

Affiliation

¹ Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, 1130 St Nicholas Ave., New York, NY 10032, USA. schaefer@rostlab.org

PMID: 20081223
PMCID: PMC2828120
DOI: 10.1093/bioinformatics/btq012

Christian Schaefer et al. Bioinformatics. 2010.

Abstract

Motivation: The mutation of amino acids often impacts protein function and structure. Mutations without negative effects survive evolutionary pressure. We study a particular aspect of structural robustness with respect to mutations: regular protein secondary structure and natively unstructured (intrinsically disordered) regions. Is the formation of regular secondary structure an intrinsic feature of amino acid sequences, or is it a feature that is lost upon mutation and is maintained by evolution against the odds? Similarly, is disorder an intrinsic sequence feature or is it difficult to maintain? To tackle these questions, we mutated native protein sequences in silico into random sequence-like ensembles and monitored the change in predicted secondary structure and disorder.

Results: We established that, by our coarse-grained measures of change, predictions and observations were similar, suggesting that our results were not biased by prediction mistakes. Changes in secondary structure and disorder predictions were linearly proportional to the change in sequence. Surprisingly, neither the content nor the length distribution for the predicted secondary structure changed substantially. Regions with long disorder behaved differently in that significantly fewer such regions were predicted after a few mutation steps. Our findings suggest that the formation of regular secondary structure is an intrinsic feature of random amino acid sequences, while the formation of long disordered regions is not an intrinsic feature of proteins with disordered regions. Put differently, helices and strands appear to be maintained easily by evolution, whereas maintaining disordered regions appears difficult. Neutral mutations with respect to disorder are therefore very unlikely.
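To make the mutation idea concrete, here is a minimal sketch of one in silico mutation trajectory together with the pairwise percentage sequence identity (PIDE) to the native. It is not the authors' protocol: the uniform background distribution, the number of sites changed per step, and the names `mutate_step`, `pide` and `trajectory` are all assumptions made for illustration. A PAM120-based variant would draw each replacement from substitution probabilities conditioned on the current residue instead.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate_step(seq, n_sites, rng, background=None):
    """Replace n_sites distinct positions with residues drawn from a background.

    background: optional dict residue -> weight; uniform if omitted (assumption).
    A drawn residue may equal the original one, so a step can be partly silent.
    """
    weights = [background[a] for a in AMINO_ACIDS] if background else None
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_sites):
        residues[pos] = rng.choices(AMINO_ACIDS, weights=weights, k=1)[0]
    return "".join(residues)

def pide(a, b):
    """Percentage of identical residues between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def trajectory(native, steps, n_sites, seed=0):
    """Apply mutate_step repeatedly; return (mutant, PIDE to native) per step."""
    rng = random.Random(seed)
    current, out = native, []
    for _ in range(steps):
        current = mutate_step(current, n_sites, rng)
        out.append((current, pide(native, current)))
    return out
```

Each mutant in the returned list could then be passed to a secondary structure or disorder predictor, and the change in prediction plotted against PIDE.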

Figures

**Fig. 1.**
Secondary structure changes proportional to sequence. (A and C) For decreasing pairwise percentage sequence identity (x-axis, PIDE), we monitored the similarity between secondary structure predictions (Q₃, i.e. percentage of residues identical in one of the three states helix, strand and other) for native and for mutant (yellow: mutations according to PAM120, green: according to background distribution, Section 2). (A and B) show results for a single trajectory, (C and D) the consensus over an ensemble of five trajectories (Section 2). Box plots reflect the range of the distribution (Section 2); median values are marked by horizontal bars and mean values are connected by dotted lines. For instance, at ∼90% pairwise sequence identity, ∼88% of the residues are predicted in the same secondary structure as the native; for the ensemble, this value is slightly higher (leftmost bars in A and C). The curves converge nearly linearly towards values of ∼35%, corresponding to random. (B and D) For one particular example (PDB identifier 1a2s chain A), we display the actual secondary structure predictions for each mutant: native on top; each row marks one of the 69 mutation steps (Section 2); mutation by PAM120. The top (B) is for one single mutation trajectory, the bottom (D) for an ensemble of five trajectories. One observation stands out and is representative of all such plots that we examined: blocks of regular secondary structure appear to be more robust under mutation than the actual type of secondary structure, i.e. helices flip to strands and vice versa, and this happens more often than the transitions helix→other and strand→other. Borders are much more ‘fluid’ for the ensemble (D) than for a single mutation trajectory (B).
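The Q₃ measure described in this caption can be sketched in a few lines. The one-letter state labels (`H` helix, `E` strand, `L` other) and the function names `q3` and `state_transitions` are assumptions for illustration; the caption itself only defines Q₃ as the percentage of residues in the same one of three states.

```python
from collections import Counter

STATES = set("HEL")  # assumed labels: H = helix, E = strand, L = other

def q3(native_pred, mutant_pred):
    """Percentage of positions assigned the same three-state label."""
    if len(native_pred) != len(mutant_pred) or not native_pred:
        raise ValueError("predictions must be non-empty and of equal length")
    if not (set(native_pred) | set(mutant_pred)) <= STATES:
        raise ValueError("unexpected state label")
    same = sum(a == b for a, b in zip(native_pred, mutant_pred))
    return 100.0 * same / len(native_pred)

def state_transitions(native_pred, mutant_pred):
    """Count (native_state, mutant_state) pairs at positions that changed."""
    return Counter((a, b) for a, b in zip(native_pred, mutant_pred) if a != b)
```

Comparing the counts for `("H", "E")` and `("E", "H")` against `("H", "L")` and `("E", "L")` is one simple way to look at the helix/strand flips the caption mentions.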

**Fig. 2.**
Content and length of regular secondary structure unchanged. Box plots and coloring as in Figure 1. Change of regular secondary structure on mutation given by the composition of predicted helix (A) and strand (B), as well as the average lengths of predicted helices (C) and strands (D). The second and third bars on the left in (A) and (B) compare predictions (light gray) with observations (taken from DSSP, dark gray) for the PDB dataset; the first bar on the left in (A) and (B) indicates the degree to which the predictions differ for the PDB dataset (dark gray) and for a set of all human proteins (light blue). The right-most green bars mark the predictions for randomly assembled sequences (Section 2, labeled as ‘Comp’). Overall, neither the length nor the content of regular secondary structure appears to differ between native and random.

**Fig. 3.**
Predicted long disorder changes rapidly. Panels on the left show results for long regions of disorder (30 or more consecutive residues), those on the right for short regions (less than eight). The top panels (A and B) demonstrate how much the predictions of disorder changed over the course of mutations (y-axis: residues predicted identical as disorder between native and mutant as percentage of disorder predicted in native). Disorder predictions diverge from native much more rapidly than do secondary structure predictions, and much more for long (A) than for short (B) disorder. The relative content of residues in predicted long (C) and short (D) disordered regions diverges differently. The first two box plots for (C) depict the observed (dark gray) and predicted (light gray) disordered content in native sequences. Right box plots in both (C) and (D) show the disorder content in the artificially created sequences (Section 2, labeled as ‘Comp’). For a representative example (DisProt identifier: DP00006), the IUPred predictions for long (E) and short (F) disorder are shown for each mutant: native on top; each row marks one of the 69 PAM120 mutation steps (Section 2). Red lines mark predictions that fall into the threshold category (30 or more/less than eight). Long disordered regions disappear (E), whereas short disorder persists at both termini and repeatedly appears and disappears in the middle region during mutation (F).
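The long/short distinction in this caption amounts to a run-length classification of per-residue disorder calls. The sketch below assumes a binary string with `D` for disordered residues (any other character counts as ordered), and it assumes that a native residue is "retained" only if it lies in a mutant region of the same length class; the caption does not spell out that detail, and the function names are hypothetical.

```python
from itertools import groupby

LONG_MIN = 30   # "30 or more consecutive residues"
SHORT_MAX = 7   # "less than eight"

def disorder_runs(states):
    """Return (start, length) for each run of consecutive 'D' residues."""
    runs, pos = [], 0
    for label, group in groupby(states):
        n = len(list(group))
        if label == "D":
            runs.append((pos, n))
        pos += n
    return runs

def is_long(n):
    return n >= LONG_MIN

def is_short(n):
    return n <= SHORT_MAX

def positions_in(states, keep):
    """Indices of residues in disordered runs whose length satisfies keep."""
    return {i for start, n in disorder_runs(states) if keep(n)
            for i in range(start, start + n)}

def retained(native, mutant, keep):
    """Percent of native residues in the class that stay in it in the mutant.

    Returns None when the native has no residues in the chosen class.
    """
    nat = positions_in(native, keep)
    if not nat:
        return None
    return 100.0 * len(nat & positions_in(mutant, keep)) / len(nat)
```

Under these assumptions, shortening a 30-residue disordered run to 29 residues drops its retained percentage for the long class to zero, which is one reason a hard length threshold makes long disorder look fragile.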

**Fig. 4.**
Examples of proteins with mutation trajectories. For each of the four main SCOP classes (Murzin *et al.*, 1995), we randomly picked one representative short enough to fit into the space here. Ribbon plots were generated by Chimera (Pettersen *et al.*, 2004) [red: helix, green: strand, according to DSSP (Kabsch and Sander, 1983)]. (A–D) In each of the four panels, the ribbon diagram for the native is on the left, and on the right are the 69 mutation trajectories (top: native, degree of mutation decreases downwards; mutations according to PAM120, Section 2). The sequence runs from the most N-terminal residues (labeled ‘1’) to the most C-terminal ones. Note that although we show only single trajectories, rather than ensemble averages here, almost no helix or strand withstands the mutation protocol to the end.

Cited by

Evolution of Intrinsic Disorder in Protein Loops.
Mughal F, Caetano-Anollés G. Life (Basel). 2023 Oct 14;13(10):2055. doi: 10.3390/life13102055. PMID: 37895436. Free PMC article.
Integration of new genes into cellular networks, and their structural maturation.
Abrusán G. Genetics. 2013 Dec;195(4):1407-17. doi: 10.1534/genetics.113.152256. Epub 2013 Sep 20. PMID: 24056411. Free PMC article.
Proteome-wide evidence for enhanced positive Darwinian selection within intrinsically disordered regions in proteins.
Nilsson J, Grahn M, Wright AP. Genome Biol. 2011 Jul 19;12(7):R65. doi: 10.1186/gb-2011-12-7-r65. PMID: 21771306. Free PMC article.
Large extent of disorder in Adenomatous Polyposis Coli offers a strategy to guard Wnt signalling against point mutations.
Minde DP, Radli M, Forneris F, Maurice MM, Rüdiger SG. PLoS One. 2013 Oct 9;8(10):e77257. doi: 10.1371/journal.pone.0077257. eCollection 2013. PMID: 24130866. Free PMC article.
The FCS-like zinc finger scaffold of the kinase SnRK1 is formed by the coordinated actions of the FLZ domain and intrinsically disordered regions.
Jamsheer K M, Shukla BN, Jindal S, Gopan N, Mannully CT, Laxmi A. J Biol Chem. 2018 Aug 24;293(34):13134-13150. doi: 10.1074/jbc.RA118.002073. Epub 2018 Jun 26. PMID: 29945970. Free PMC article.
