FEBS J. 2010 Jun;277(12):2673-82.
doi: 10.1111/j.1742-464X.2010.07684.x.

Protein tandem repeats - the more perfect, the less structured

Julien Jorda et al. FEBS J. 2010 Jun.

Abstract

We analysed the structural properties of protein regions containing arrays of perfect and nearly perfect tandem repeats. Naturally occurring proteins with perfect repeats are practically absent among the proteins with known 3D structures. The great majority of such regions in the Protein Data Bank are found in proteins designed de novo. The abundance of natural structured proteins with tandem repeats is inversely correlated with repeat perfection: the chance of finding natural structured proteins in the Protein Data Bank increases as the level of repeat perfection decreases. Prediction of intrinsic disorder within the tandem repeats of SwissProt proteins supports the conclusion that the level of repeat perfection correlates with their tendency to be unstructured. This correlation holds across the various species and subcellular localizations, although the level of disordered tandem repeats varies significantly between these datasets. On average, in prokaryotes, tandem repeats of cytoplasmic proteins were predicted to be the most structured, whereas in eukaryotes, the most structured portion of the repeats was found in membrane proteins. Our study supports the hypothesis that, in general, repeat perfection is a sign of recent evolutionary events rather than of exceptional structural and/or functional importance of the repeat residues.

Keywords: bioinformatics; disordered conformation; evolution; protein structure; sequence analysis.

Figures

FIGURE 1
The 3D structures of proteins with almost perfect tandem repeats. Repeat regions are shown in color.
FIGURE 2
Compositional profiling of tandem repeats, of the entire sequences of proteins containing these tandem repeats, and of a set of fully disordered proteins from DisProt, each compared with the composition of fully structured proteins from the PDB. CStructAA is the content of a given amino acid in the set of structured proteins; CDatasetAA is the content of this amino acid in the dataset of interest. Amino acids are denoted by their one-letter codes and arranged in order of decreasing structure-promoting propensity according to the TOP-IDP scale [37].
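The comparison described in this caption can be sketched in Python. This is a minimal illustration, not the authors' code: the caption names the two quantities (CStructAA and CDatasetAA) but does not state how they are combined, so the normalized difference (CDatasetAA - CStructAA) / CStructAA used here is an assumption. The function names and the toy sequences are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequences):
    """Return the fraction of each standard amino acid over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore anything that is not one of the 20 standard residues.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(dataset_seqs, struct_seqs):
    """Assumed profile: (C_dataset - C_struct) / C_struct per amino acid.

    Residues absent from the structured reference set are skipped to
    avoid division by zero.
    """
    c_data = composition(dataset_seqs)
    c_struct = composition(struct_seqs)
    return {
        aa: (c_data[aa] - c_struct[aa]) / c_struct[aa]
        for aa in AMINO_ACIDS
        if c_struct[aa] > 0
    }


if __name__ == "__main__":
    # Toy data only: a uniform "structured" reference and two repeat-like segments.
    structured = ["ACDEFGHIKLMNPQRSTVWY"]
    repeats = ["QQQQQQQQQQ", "SPSPSPSPSP"]
    for aa, value in relative_difference(repeats, structured).items():
        print(f"{aa}: {value:+.2f}")
```

Under this assumed definition, positive values mark residues enriched in the dataset of interest relative to structured proteins, and negative values mark depleted residues.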
FIGURE 3
(A) Difference in amino acid composition between fully structured proteins and tandem repeat segments, subdivided into groups with different levels of repeat perfection. (B) Homorepeats are analyzed separately because of their unusually high occurrence compared with other tandem repeats. For this purpose, a dataset of perfect and cryptic homorepeats was created and subdivided into three groups depending on the Psim values. CtrAA and ChrAA are the contents of a given amino acid in the set of tandem repeats (excluding homorepeats) and in the set of homorepeats only, respectively. Amino acid residues are arranged in four sets: order-promoting aromatic and aliphatic amino acids (W, F, Y, I, M, L, V, and A), denoted as non-polar; glycine, an order-neutral and, at the same time, specific residue; disorder-promoting polar residues (N, C, T, Q, S, R, D, H, E, and K); and disorder-promoting proline.
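The four residue sets listed in this caption can be written down directly. The sketch below uses the memberships exactly as given in the caption. The function name, the group labels, and the example sequence are hypothetical, and summarizing a segment by its per-group fractions is an illustrative choice rather than the authors' stated procedure.

```python
# Residue sets as listed in the Figure 3 caption.
RESIDUE_GROUPS = {
    "non-polar (order-promoting)": "WFYIMLVA",
    "glycine (order-neutral)": "G",
    "polar (disorder-promoting)": "NCTQSRDHEK",
    "proline (disorder-promoting)": "P",
}

# Reverse lookup: one-letter code -> group label.
GROUP_OF = {
    aa: label for label, members in RESIDUE_GROUPS.items() for aa in members
}


def group_fractions(sequence):
    """Return the fraction of residues in each of the four sets.

    Characters outside the 20 standard residues are ignored.
    """
    labels = [GROUP_OF[ch] for ch in sequence.upper() if ch in GROUP_OF]
    if not labels:
        raise ValueError("no standard amino acids found")
    return {label: labels.count(label) / len(labels) for label in RESIDUE_GROUPS}


if __name__ == "__main__":
    # Hypothetical repeat-like segment, for illustration only.
    for label, fraction in group_fractions("GPQGPQGPQGPQ").items():
        print(f"{label}: {fraction:.2f}")
```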
FIGURE 4
Length distribution of predicted disordered segments. (A) Length distribution of predicted disorder for the four groups of tandem repeats. (B) Length distribution of predicted disorder for the whole protein sequences containing the tandem repeats in the four groups.

References

    1. Pellegrini M, Marcotte EM, Yeates TO. A fast algorithm for genome-wide analysis of proteins with repeated sequences. Proteins. 1999;35:440–446.
    2. Fraser RDB, MacRae TP. Conformation in fibrous proteins and related synthetic polypeptides. Academic Press; London and New York: 1973.
    3. Yoder MD, Lietzke SE, Jurnak F. Unusual structural features in the parallel beta-helix in pectate lyases. Structure. 1993;1:241–251.
    4. Baumann U, Wu S, Flaherty KM, McKay DB. Three-dimensional structure of the alkaline protease of Pseudomonas aeruginosa: a two-domain protein with a calcium binding parallel beta roll motif. EMBO J. 1993;12:3357–3364.
    5. Kobe B, Kajava AV. The leucine-rich repeat as a protein recognition motif. Curr Opin Struct Biol. 2001;11:725–732.
