Int J Mol Sci. 2024 Mar 5;25(5):2994.
doi: 10.3390/ijms25052994.

Structured Tandem Repeats in Protein Interactions

Juan Mac Donagh et al. Int J Mol Sci. 2024.

Abstract

Tandem repeats (TRs) in protein sequences are consecutive, highly similar sequence motifs. Some types of TRs fold into structural units that pack together in ensembles, forming either an (open) elongated domain or a (closed) propeller, where the last unit of the ensemble packs against the first one. Here, we examine TR proteins (TRPs) to see how their sequence, structure, and evolutionary properties favor them for a function as mediators of protein interactions. Our observations suggest that TRPs bind other proteins using large, structured surfaces like globular domains; in particular, open-structured TR ensembles are favored by flexible termini and the possibility to tightly coil against their targets. While, intuitively, open ensembles of TRs seem prone to evolve due to their potential to accommodate insertions and deletions of units, these evolutionary events are unexpectedly rare, suggesting that they are advantageous for the emergence of the ancestral sequence but are fixed early. We hypothesize that their flexibility makes it easier for further proteins to adapt to interact with them, which would explain their large number of protein interactions. We provide insight into the properties of open TR ensembles, which make them scaffolds for alternative protein complexes to organize genes, RNA and proteins.

Keywords: protein evolution; protein flexibility; protein structure; protein–protein interactions; tandem repeats.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Properties of TRs by type and compared to other protein regions. (a) Number of repeat units identified. HEAT_AAA, HEAT_ADB and HEAT_IMB are three variants of HEAT repeats and can be redundant. TRs forming open and closed ensembles are indicated with black and red labels, respectively. (b) Average number of protein partners for TRPs. Horizontal dotted lines indicate the values for all human proteins (proteome), for proteins annotated with globular domains (Pfam), and for all TRPs. (c) Frequency of amino acids in SLiMs. Values are shown for: the complete human proteome (All), for proteins that do not contain TRs (non TRPs), for TRPs, for residues in TRs of TRPs (TRPs: in TRs), for residues outside TRs in TRPs (TRPs: out TRs), and in annotated globular domains (Pfam). (d) Frequency of amino acids in phosphorylation sites. (e) Frequency of amino acids in IDRs. (f) Disordered content by repeat type (PFTA and PFTB are not displayed because their numbers are too low). To prepare the plots (c–f), we obtained the coordinates of the features (SLiMs, phosphorylation sites, disorder regions) in all human sequences from the corresponding databases (see Methods for details) and then divided the number of residues within the given feature by the total of residues in the corresponding type of sequence.
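The ratio described at the end of this caption (residues inside a feature divided by all residues of a given sequence type) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 1-based inclusive coordinate convention, and the choice to count overlapping annotations only once are all assumptions made here.

```python
from typing import Dict, List, Tuple

# Assumed convention: 1-based, inclusive (start, end) coordinates.
Interval = Tuple[int, int]


def merged_length(intervals: List[Interval]) -> int:
    """Count residues covered by at least one interval.

    Overlapping annotations are counted once (an assumption of this sketch).
    """
    covered = 0
    last_end = 0  # last residue already counted
    for start, end in sorted(intervals):
        start = max(start, last_end + 1)  # skip residues already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered


def feature_fraction(seq_lengths: Dict[str, int],
                     features: Dict[str, List[Interval]]) -> float:
    """Residues within a feature divided by total residues in the sequence set."""
    total = sum(seq_lengths.values())
    in_feature = sum(merged_length(features.get(pid, [])) for pid in seq_lengths)
    return in_feature / total if total else 0.0


# Hypothetical toy data: two sequences and their annotated feature regions.
lengths = {"P1": 100, "P2": 50}
slims = {"P1": [(1, 10), (5, 20)], "P2": [(41, 50)]}
fraction = feature_fraction(lengths, slims)  # 30 covered residues out of 150
```

Restricting `seq_lengths` to one class of sequence (for example only TRPs, or only the TR regions treated as separate entries) would give the value for that class.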
Figure 2
pLDDT scores of AlphaFold predictions along ensembles of TRs. (a) WD40 (closed ensemble) compared to an average of four open ensembles (shown in (b)). (b) Values for the four open ensembles. The x-axis indicates the relative position in the TR ensemble N- to C-terminal.
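Because TR ensembles differ in length, comparing pLDDT along a relative N- to C-terminal position requires rescaling each ensemble to a common axis. One way to do this is sketched below; the binning scheme, the number of bins, and the function name are assumptions for illustration, and the source does not state how the published profiles were computed.

```python
from typing import List, Optional


def binned_profile(ensembles: List[List[float]],
                   n_bins: int = 10) -> List[Optional[float]]:
    """Average per-residue scores over relative position (0 = N-term, 1 = C-term).

    Each inner list holds the per-residue pLDDT values of one TR ensemble,
    ordered N- to C-terminal. Bins with no residues are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for scores in ensembles:
        n = len(scores)
        for i, score in enumerate(scores):
            rel = i / (n - 1) if n > 1 else 0.0       # relative position in [0, 1]
            b = min(int(rel * n_bins), n_bins - 1)    # last residue goes in last bin
            sums[b] += score
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical toy input: two ensembles of different length, two bins.
profile = binned_profile([[50.0, 90.0, 90.0, 50.0], [60.0, 100.0]], n_bins=2)
```

A lower average in the first and last bins than in the middle would correspond to the flexible termini discussed in the abstract.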
Figure 3
Flexibility of interacting TRPs. Structures of protein complexes with TRPs by repeat type. Elongated: ARM repeats in human catenin beta-1 binding NR5A2 (PDB:3TX7); LRR repeats in Toll-like receptor 4 binding LY96 (PDB:4G8A); HEAT repeats in importin subunit beta-1 shown in the same orientation, forming three complexes with histone H1.0 (PDB:6N88), Zinc finger protein SNAI1 (PDB:3W5K) and the IBB domain of Snurportin-1 (PDB:2QNA). Cyclic: RCC1 repeats in RPGR binding the interacting domain of RPGRIP1 (PDB:4QAM) and KELCH repeats in ARPC1B binding ARPC4 (PDB:6YW6). TRP in purple and bound protein in yellow.
Figure 4
Cases of fast evolution in Plasmodium species. (a) Gain of an LRR unit in A0A1D3JKH0_PLAMA positions 595–618. (b) Gain of a TPR unit in A0A1D3TFE6_PLAMA positions 660–693 (colored in red in the structure). Sequence identifiers from UniProtKB. In order, species are: Plasmodium berghei, Plasmodium relictum, Plasmodium malariae, Plasmodium knowlesi, Plasmodium gonderi and Plasmodium falciparum. No alternatively spliced isoforms for A0A1D3JKH0_PLAMA or A0A1D3TFE6_PLAMA are given in UniProt (February 2024). The structures shown are models from AlphaFold [41] and Robetta [42] (left and right, respectively).
Figure 5
Functional enrichment of TRP interactors. Top: Biological Process. Middle: Cellular Component. Bottom: Molecular Function. Gene Ontology (GO) enrichment analysis was carried out for a set of TRP interactors (All) and then separately for the interactors of each TRP type (see Methods for details). Enriched GO Biological Process (BP), Molecular Function (MF) and Cellular Component (CC) terms with the lowest adjusted p-value were kept.

