BMC Evol Biol. 2020 Feb 14;20(1):30.
doi: 10.1186/s12862-020-1591-0.

The modular nature of protein evolution: domain rearrangement rates across eukaryotic life

Elias Dohmen et al. BMC Evol Biol. 2020.

Abstract

Background: Modularity is important for evolutionary innovation. The recombination of existing units to form larger complexes with new functionalities spares the need to create novel elements from scratch. In proteins, this principle can be observed at the level of protein domains, functional subunits which are regularly rearranged to acquire new functions.

Results: In this study we analyse the mechanisms leading to new domain arrangements in five major eukaryotic clades (vertebrates, insects, fungi, monocots and eudicots) at unprecedented depth and breadth. This allows us, for the first time, to directly compare rates of rearrangements between different clades and to identify both lineage-specific and general patterns of evolution in the context of domain rearrangements. We analyse arrangement changes along phylogenetic trees by reconstructing ancestral domain content in combination with feasible single-step events, such as fusion or fission. Using this approach we explain up to 70% of all rearrangements by tracing them back to their precursors. We find that the rates in general, and the ratio between these rates for a given clade in particular, are highly consistent across all clades. In agreement with previous studies, fusions are the most frequent event leading to new domain arrangements. A lineage-specific pattern in fungi reveals exceptionally high loss rates compared to other clades, supporting recent studies highlighting the importance of loss for evolutionary innovation. Furthermore, our methodology allows us to link domain emergences at specific nodes in the phylogenetic tree to important functional developments, such as the origin of hair in mammals.

Conclusions: Our results demonstrate that domain rearrangements are based on a canonical set of mutational events whose rates lie within a relatively narrow and consistent range. In addition, the knowledge gained about these rates provides a basis for advanced domain-based methodologies for phylogenetics and homology analysis, which complement current sequence-based methods.

Keywords: Ancestral reconstruction; Evolutionary history; Protein domain; Proteome analysis; Rearrangement rates.
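
The single-step events named in the Results (fusion and fission) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an arrangement is an ordered tuple of domain identifiers, that a fusion concatenates two parental arrangements, and that a fission splits one parental arrangement into two pieces that are both newly gained. The function names and the domain labels "A" to "D" are hypothetical.

```python
from typing import List, Set, Tuple

Arrangement = Tuple[str, ...]


def fusion_solutions(new: Arrangement, parent: Set[Arrangement]) -> List[Tuple[Arrangement, Arrangement]]:
    """Return every (left, right) pair of parental arrangements whose
    concatenation equals the new arrangement."""
    return [
        (new[:i], new[i:])
        for i in range(1, len(new))
        if new[:i] in parent and new[i:] in parent
    ]


def fission_solutions(new: Arrangement, parent: Set[Arrangement],
                      gained: Set[Arrangement]) -> List[Arrangement]:
    """Return parental arrangements that split into the new arrangement plus
    a second piece that was also gained (simplifying assumption)."""
    hits = []
    for p in sorted(parent):
        if any(
            new in (p[:i], p[i:]) and p[:i] in gained and p[i:] in gained
            for i in range(1, len(p))
        ):
            hits.append(p)
    return hits


def classify_gain(new: Arrangement, parent: Set[Arrangement],
                  child: Set[Arrangement]) -> str:
    """Label a newly gained arrangement by the single-step event that explains it."""
    gained = child - parent
    has_fusion = bool(fusion_solutions(new, parent))
    has_fission = bool(fission_solutions(new, parent, gained))
    if has_fusion and has_fission:
        return "ambiguous"
    if has_fusion:
        return "fusion"
    if has_fission:
        return "fission"
    return "unexplained"


# Toy example with hypothetical domain labels.
parent_node = {("A",), ("B",), ("C", "D")}
child_node = {("A", "B"), ("C",), ("D",)}
fused = classify_gain(("A", "B"), parent_node, child_node)
split = classify_gain(("C",), parent_node, child_node)
```

In this toy example the arrangement `("A", "B")` is explained by a fusion of two parental arrangements, while `("C",)` is explained by a fission of `("C", "D")`. Arrangements that match both event types are reported as ambiguous rather than forced into one category.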

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Frequency of the different solution types. Exact and non-ambiguous solutions can be found in about 50% of the cases.
Fig. 2
Number of rearrangement events across the eudicot phylogeny. The total number of rearrangement events at a specific node is given as a number next to the pie chart. For details on 'Outgroups' see Methods. Significant GO terms in gained domain arrangements are shown in a tag cloud (box). GO terms that might point to eudicot-specific evolution are 'recognition of pollen' and 'plant-type cell wall organization'.
Fig. 3
Reconstruction of ancestral domain content and rearrangement events. Given a known phylogeny and domain annotations of all included species (a), it becomes possible to infer six event types leading to new domain contents over time (b). First, the ancestral domain content of all inner nodes is inferred by two different parsimony approaches: for all single domains using a Dollo parsimony approach (light blue background), and for all arrangements using a Fitch parsimony approach (light orange background). In a first traversal from the leaves to the root of the tree, all inner node states are annotated as present, absent or unknown according to the respective parsimony rules (c) (see Additional file 1). In a second traversal from the root to the leaves, the unknown states at the root are first resolved according to the parsimony rules (see Additional file 1) and subsequently all following unknown states are set to the parental state (d). In the reconstructed tree it becomes possible to infer the different event types at any node by comparison with the parental node (e). In this way emergences/losses of domains are inferred from the Dollo tree, while arrangements are inferred from the Fitch tree (f).
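
The two traversals described in the Fig. 3 caption can be sketched for the Fitch case as follows. This is a minimal illustration, not the authors' code. It assumes the tree is given as a dictionary from node name to child names, and it treats the state of one arrangement as a boolean. The root rule from Additional file 1 is not reproduced here; the `root_default` parameter is a hypothetical stand-in for it. The Dollo reconstruction used for single domains is not covered.

```python
from typing import Dict, List, Set, Tuple

Tree = Dict[str, List[str]]  # node -> children; leaves have no entry


def fitch_presence(tree: Tree, root: str, leaf_present: Dict[str, bool],
                   root_default: bool = False) -> Dict[str, bool]:
    """Reconstruct presence/absence of one arrangement at every node."""
    sets: Dict[str, Set[bool]] = {}

    def up(node: str) -> None:
        # First traversal (leaves to root): a node is "unknown" when its
        # state set holds both True and False.
        kids = tree.get(node, [])
        if not kids:
            sets[node] = {leaf_present[node]}
            return
        for kid in kids:
            up(kid)
        common = set.intersection(*(sets[k] for k in kids))
        sets[node] = common if common else set.union(*(sets[k] for k in kids))

    states: Dict[str, bool] = {}

    def down(node: str, parent_state: bool) -> None:
        # Second traversal (root to leaves): unknown states take the
        # parental state; an unknown root takes root_default (assumption).
        s = sets[node]
        states[node] = next(iter(s)) if len(s) == 1 else parent_state
        for kid in tree.get(node, []):
            down(kid, states[node])

    up(root)
    down(root, root_default)
    return states


def gain_loss_events(tree: Tree, root: str,
                     states: Dict[str, bool]) -> List[Tuple[str, str]]:
    """Compare each node with its parent and report gains and losses."""
    events = []
    stack = [root]
    while stack:
        node = stack.pop()
        for kid in tree.get(node, []):
            if states[kid] != states[node]:
                events.append((kid, "gain" if states[kid] else "loss"))
            stack.append(kid)
    return events


# Toy phylogeny with hypothetical species names.
toy_tree = {"root": ["n1", "sp_D"], "n1": ["sp_A", "sp_B"]}
toy_leaves = {"sp_A": True, "sp_B": True, "sp_D": False}
toy_states = fitch_presence(toy_tree, "root", toy_leaves)
toy_events = gain_loss_events(toy_tree, "root", toy_states)
```

With the default root rule the arrangement is inferred as gained on the branch leading to `n1`. Passing `root_default=True` instead yields a single loss on the branch to `sp_D`, which shows how strongly the root rule shapes the inferred events.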
