PLoS Comput Biol. 2012;8(11):e1002701.
doi: 10.1371/journal.pcbi.1002701. Epub 2012 Nov 15.

This Déjà vu feeling--analysis of multidomain protein evolution in eukaryotic genomes

Christian M Zmasek et al. PLoS Comput Biol. 2012.

Abstract

Evolutionary innovation in eukaryotes, and especially in animals, is at least partially driven by genome rearrangements and the resulting emergence of proteins with new domain combinations, and thus potentially novel functionality. Given the random nature of such rearrangements, one could expect that proteins with particularly useful multidomain combinations may have been rediscovered multiple times by parallel evolution. However, existing reports suggest a minimal role of this phenomenon in the overall evolution of eukaryotic proteomes. We assembled a collection of 172 complete eukaryotic genomes that is not only the largest, but also the most phylogenetically complete set of genomes analyzed so far. By employing a maximum parsimony approach to compare repertoires of Pfam domains and their combinations, we show that independent evolution of domain combinations is significantly more prevalent than previously thought. Our results indicate that about 25% of all currently observed domain combinations have evolved multiple times. Interestingly, this percentage is even higher for sets of domain combinations in individual species, with, for instance, 70% of the domain combinations found in the human genome having evolved independently at least once in other species. We also show that previous, much lower estimates of this rate are most likely due to the small number and biased phylogenetic distribution of the genomes analyzed. The independent emergence of identical domain combinations is widespread and is not limited to domains in specific functional categories. Besides data from large-scale analyses, we also present individual examples of independent domain combination evolution. The surprisingly large contribution of parallel evolution to the development of the domain combination repertoire in extant genomes has profound consequences for our understanding of the evolution of pathways and cellular processes in eukaryotes and for comparative functional genomics.
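To make the idea of counting independent origins by maximum parsimony concrete, the following is a minimal sketch and not the authors' implementation. It assumes a rooted binary species tree written as nested tuples, a single presence/absence character for one domain combination, and unweighted (Fitch) parsimony with ambiguous nodes resolved toward "absent". The function names, the toy tree, and the species labels are all hypothetical.

```python
def fitch_sets(tree, present):
    """Bottom-up Fitch pass. Returns (state_set, children) for each node.

    tree    -- a leaf name (str) or a tuple of subtrees (assumed binary)
    present -- set of leaf names whose genome has the domain combination
    """
    if isinstance(tree, str):
        return ({1} if tree in present else {0}, [])
    kids = [fitch_sets(child, present) for child in tree]
    common = set.intersection(*(k[0] for k in kids))
    states = common if common else set.union(*(k[0] for k in kids))
    return (states, kids)


def count_gains(node, parent_state=0):
    """Top-down pass. Counts absent-to-present transitions (gains).

    The lineage above the root is treated as 'absent' (an assumption),
    so a combination present at the root counts as one gain.
    """
    states, kids = node
    # Keep the parent's state when allowed, otherwise take the forced state.
    state = parent_state if parent_state in states else min(states)
    gains = 1 if (parent_state == 0 and state == 1) else 0
    return gains + sum(count_gains(k, state) for k in kids)


# Hypothetical six-species tree; labels are placeholders, not real taxa.
toy_tree = ((("animal_a", "animal_b"), "fungus"),
            (("alga", "plant"), "amoeba"))

# Combination found in both animals and in one alga only.
parallel = {"animal_a", "animal_b", "alga"}
print(count_gains(fitch_sets(toy_tree, parallel)))  # 2 independent gains
```

A combination with more than one inferred gain would be counted as independently evolved under this toy model. Other parsimony variants (for example, ones that penalize gains and losses differently) can give different counts for the same data.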

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1
Figure 1. Overview of the current model of eukaryote evolution.
The six “supergroups” (Opisthokonta, Amoebozoa, Archaeplastida, Chromalveolata, Rhizaria, and Excavata) are shown; the placement of Excavata is under debate.
Figure 2
Figure 2. Numbers of domains and domain combinations in select species.
The colors used correspond to the colors in Figure 1 (orange for Opisthokonta, red for Amoebozoa, green for Archaeplastida, blue for Chromalveolata, and purple for Excavata).
Figure 3
Figure 3. Average ratios between the numbers of domain combinations and (number of domains)² for select groups of organisms.
Standard deviations are shown as error bars. The asterisk indicates the results for Deuterostomia when the amphioxus Branchiostoma floridae genome is excluded. The colors used correspond to the colors in Figure 1.
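The quantity plotted in Figure 3 can be illustrated with a short sketch. It assumes each genome is summarized by a pair (number of distinct domains, number of distinct domain combinations) and uses the sample standard deviation; the caption does not say which estimator the authors used. The function names and the counts are made up for illustration.

```python
from statistics import mean, stdev


def combination_ratio(n_domains, n_combinations):
    """Domain combinations divided by the squared number of domains."""
    return n_combinations / n_domains ** 2


def group_summary(genomes):
    """Mean and sample standard deviation of the ratio over one group.

    genomes -- list of (n_domains, n_combinations) pairs, one per genome
    """
    ratios = [combination_ratio(d, c) for d, c in genomes]
    spread = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), spread


# Made-up counts for two genomes in one hypothetical group.
toy_group = [(100, 50), (200, 100)]
avg, sd = group_summary(toy_group)
print(avg, sd)
```

Dividing by the squared domain count relates the observed combinations to the rough number of possible domain pairs, which makes groups with very different domain repertoires comparable.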
Figure 4
Figure 4. Clade-specific domains and domain combinations.
This figure shows the numbers of clade-specific domain combinations (black numbers after the slash) and core domain combinations (black numbers before the slash) for select clades. Below these are the numbers of clade-specific domains (gray numbers after the slash) and core domains (gray numbers before the slash). Numbers in brackets refer to domain combination counts when the amphioxus Branchiostoma floridae genome is excluded. The numbers of analyzed genomes are shown in parentheses below the clade names. For example, the 19 analyzed vertebrate genomes contain 1,416 clade-specific domain combinations, 102 of which are found in each of the 19 analyzed genomes. These 19 genomes also contain 380 clade-specific domains, 67 of which are present in each vertebrate genome. Ambulacraria is a clade of deuterostomes that includes echinoderms and hemichordates. To facilitate comparison of different taxonomic levels, established phyla are shown with a light-blue background, whereas super-phyla have a light-purple background. This figure was made using the “gathering” cutoffs provided by Pfam. For a detailed description of parameters, see Materials and Methods. Complete counts are shown in Table S6.
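The two counts described in this caption can be expressed as set operations. This sketch assumes "clade-specific" means found in at least one genome of the clade and in no analyzed genome outside it, and "core" means the clade-specific items present in every genome of the clade, as the vertebrate example suggests. The function name, genome labels, and domain names are hypothetical.

```python
def clade_specific_and_core(genomes, clade):
    """Return (clade_specific, core) sets of domain combinations.

    genomes -- dict mapping genome name to a set of domain combinations
    clade   -- set of genome names that belong to the clade of interest
    """
    inside = [combos for name, combos in genomes.items() if name in clade]
    outside = [combos for name, combos in genomes.items() if name not in clade]
    seen_outside = set().union(*outside)
    # Present somewhere in the clade, never outside it.
    specific = set().union(*inside) - seen_outside
    # Clade-specific and present in every genome of the clade.
    core = specific.intersection(*inside)
    return specific, core


# Toy data: combinations are (domain, domain) pairs with placeholder names.
toy_genomes = {
    "vert_a": {("A", "B"), ("C", "D"), ("E", "F")},
    "vert_b": {("A", "B"), ("C", "D")},
    "outgroup": {("C", "D"), ("G", "H")},
}
specific, core = clade_specific_and_core(toy_genomes, {"vert_a", "vert_b"})
print(len(core), "/", len(specific))  # core before the slash, as in Figure 4
```

The same function applies to single domains if each genome maps to a set of domain names instead of pairs.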
Figure 5
Figure 5. Taxonomic distribution of domain combinations.
A shows the distribution of the 34,778 distinct domain combinations encountered in this work over the five eukaryotic “supergroups” analyzed, plus Thecamonas trahens (see Figure 1). B shows the distribution of the 14,704 Holozoa-specific domain combinations over various groups of Holozoa. See Table 2 and Table S6 for detailed numbers.
Figure 6
Figure 6. Parallel evolution of the K Homology (KH)∼DEAD/DEAH box helicase combination between Bilateria and Micromonas (a group of green algae).
The complete diagram on which this simplified version is based is available in the supplementary materials.
Figure 7
Figure 7. Independent domain combination evolution under an unweighted parsimony model.
The histogram in A shows the sum for reappearing domains versus the number of reappearances. B is a comparison between the sum of domains that appear only once versus the sum of domains that appear more than once.
Figure 8
Figure 8. Normalized rates of independent domain combination evolution.
Sums of independently evolved domain combinations across major splits on the eukaryotic tree of life are shown, normalized by the number of genomes. “Opistho” stands for Opisthokonta and “Choano” stands for Choanoflagellatea. Ambulacraria is a clade that includes echinoderms and hemichordates.
Figure 9
Figure 9. Parallel evolution of the NACHT∼Ankyrin combination between Neoptera (winged insects) and fungi.
The complete diagram on which this simplified version is based is available in the supplementary materials (which explains that both major groups of fungi, Basidiomycota and Ascomycota, have one independent domain fusion event each).
Figure 10
Figure 10. Parallel evolution of the Amidohydrolase∼Aspartate/ornithine carbamoyltransferase combination between Metazoa and Dictyostelium.
The complete diagram on which this simplified version is based is available in the supplementary materials.
