BMC Biol. 2018 Mar 13;16(1):30.
doi: 10.1186/s12915-018-0500-0.

Formation of chimeric genes with essential functions at the origin of eukaryotes


Raphaël Méheust et al. BMC Biol. 2018.

Abstract

Background: Eukaryotes evolved from the symbiotic association of at least two prokaryotic partners, and a good deal is known about the timings, mechanisms, and dynamics of these evolutionary steps. Recently, it was shown that a new class of nuclear genes, symbiogenetic genes (S-genes), was formed concomitant with endosymbiosis and the subsequent evolution of eukaryotic photosynthetic lineages. Understanding their origins and contributions to eukaryogenesis would provide insights into the ways in which cellular complexity has evolved.

Results: Here, we show that chimeric nuclear genes (S-genes), built from prokaryotic domains, are critical for explaining the leap forward in cellular complexity achieved during eukaryogenesis. A total of 282 S-gene families contributed solutions to many of the challenges faced by early eukaryotes, including enhancing the informational machinery, processing spliceosomal introns, tackling genotoxicity within the cell, and ensuring functional protein interactions in a larger, more compartmentalized cell. For hundreds of S-genes, we confirmed the origins of their components (bacterial, archaeal, or generally prokaryotic) by maximum likelihood phylogenies. Remarkably, Bacteria contributed nine-fold more S-genes than Archaea, including a two-fold greater contribution to informational functions. Therefore, there is an additional, large bacterial contribution to the evolution of eukaryotes, implying that fundamental eukaryotic properties do not strictly follow the traditional informational/operational divide for archaeal/bacterial contributions to eukaryogenesis.

Conclusion: This study demonstrates the extent to which, and the process through which, prokaryotic fragments from bacterial and archaeal genes inherited during eukaryogenesis underlie the creation of novel chimeric genes with important functions.

Keywords: Chimeric genes; Endosymbiosis; Eukaryogenesis; Evolutionary genomics; Evolutionary transition.

Conflict of interest statement

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Putative phylogeny of eukaryotes, based on Derelle et al. [82], that shows the distribution of 573 S-gene families. Family evolution reconstruction was performed using Dollo parsimony. The four boxes correspond to the number of families involved in metabolism (red), information storage and processing (blue), cellular processes and signaling (green), and poorly characterized processes (white)
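The Dollo parsimony reconstruction named in this caption assumes that a family is gained exactly once and can only be lost afterwards, so the gain is placed at the last common ancestor of every taxon that carries the family. The following is a minimal sketch of that single step, not the authors' pipeline; the tree, the leaf names (`O1`, `D1`, ...) and the function name `dollo_gain_node` are hypothetical and chosen only for illustration.

```python
def dollo_gain_node(parent, present_leaves):
    """Return the node where Dollo parsimony places the single gain:
    the last common ancestor of all leaves that carry the family.

    parent: dict mapping each non-root node to its parent (hypothetical toy tree).
    present_leaves: non-empty collection of leaves in which the family is found.
    """
    def path_to_root(node):
        # Walk upward, collecting the node and all of its ancestors.
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    leaves = list(present_leaves)
    # Candidate ancestors, ordered from the first leaf up to the root.
    common = path_to_root(leaves[0])
    for leaf in leaves[1:]:
        ancestors = set(path_to_root(leaf))
        common = [n for n in common if n in ancestors]
    # The first surviving entry is the deepest shared ancestor.
    return common[0]


# Toy topology (an assumption): two major clades with two leaves each.
TOY_PARENT = {
    "Opimoda": "root", "Diphoda": "root",
    "O1": "Opimoda", "O2": "Opimoda",
    "D1": "Diphoda", "D2": "Diphoda",
}

print(dollo_gain_node(TOY_PARENT, {"O1", "D2"}))  # gain placed at the root
print(dollo_gain_node(TOY_PARENT, {"O1", "O2"}))  # gain placed at Opimoda
```

Under this model, any descendant of the gain node that lacks the family is interpreted as a secondary loss.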
Fig. 2
Functional annotation of the 573 S-genes based on COG categories. S-gene families were divided into early (S-genes found in both Opimoda and Diphoda, 286 gene families, in blue), intermediate (S-genes found either in Opimoda or Diphoda, 101 gene families, in pink), and lineage specific (S-genes found in a single eukaryotic supergroup, 186 gene families, in green) (COG category definitions can be found here: http://eggnogdb.embl.de/download/eggnog_4.5/COG_functional_categories.txt)
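The three-way split described in this caption can be read as a simple rule on the taxonomic distribution of each family. The sketch below is one possible reading, under the assumption that "intermediate" means a family found in more than one supergroup but within only one of the two clades; the supergroup labels, the `CLADE_OF` mapping and the function name `classify_family` are hypothetical. The final line checks that the three counts given in the caption add up to the 573 families.

```python
# Hypothetical mapping from supergroup label to major clade (toy labels).
CLADE_OF = {"O1": "Opimoda", "O2": "Opimoda", "D1": "Diphoda", "D2": "Diphoda"}


def classify_family(supergroups, clade_of=CLADE_OF):
    """Classify an S-gene family from the set of supergroups where it occurs."""
    supergroups = set(supergroups)
    clades = {clade_of[s] for s in supergroups}
    if clades == {"Opimoda", "Diphoda"}:
        return "early"             # present on both sides of the root
    if len(supergroups) == 1:
        return "lineage specific"  # restricted to a single supergroup
    # Assumption: several supergroups, all inside one clade.
    return "intermediate"


print(classify_family({"O1", "D2"}))  # early
print(classify_family({"D1", "D2"}))  # intermediate
print(classify_family({"O2"}))        # lineage specific

# Counts quoted in the caption: early + intermediate + lineage specific.
CAPTION_COUNTS = {"early": 286, "intermediate": 101, "lineage specific": 186}
print(sum(CAPTION_COUNTS.values()))   # 573
```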
Fig. 3
Mapping of the functions of 573 S-genes in a eukaryotic cell (figure adapted from de Duve [83]). Numbers in red correspond to functions containing essential S-genes in yeast
Fig. 4
Hierarchical clustering of S-gene families according to their component origins. The heatmap represents the ratio of genes in a given family (columns) that have at least one component of a given origin (eukaryotic, archaeal, bacterial or prokaryotic; the rows). White lines correspond to the absence of a component from a given origin in every gene in the given S-gene family. The colored lines correspond to the presence of at least one component of the given origin in a given percentage of genes in the given S-gene family (red lines denote that all (100%) genes contain a given origin component). The first colored top bar indicates the functional annotation. The black bars in the second colored top bar indicate the reclassified S-genes after applying the HMM-profile procedure. Cluster 1 roughly corresponds to 60 S-genes with only prokaryotic components (PROK-PROK), clusters 2 and 7 roughly correspond to 203 S-genes with only bacterial components (BAC-BAC), cluster 3 roughly corresponds to 122 S-genes with bacterial and eukaryotic components (BAC-EUK), cluster 4 roughly corresponds to 62 S-genes with prokaryotic and eukaryotic components (PROK-EUK), cluster 5 roughly corresponds to 62 S-genes with prokaryotic and bacterial components (PROK-BAC), cluster 6 roughly corresponds to 8 S-genes with bacterial and archaeal components (ARC-BAC), cluster 8 roughly corresponds to 23 S-genes with archaeal and eukaryotic components (ARC-EUK), cluster 9 roughly corresponds to 7 S-genes with only archaeal components (ARC-ARC), and cluster 10 roughly corresponds to 4 S-genes with prokaryotic and archaeal components (ARC-PROK)
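Each heatmap cell described in this caption is the fraction of genes in a family that carry at least one component of a given origin. A minimal sketch of that calculation follows, assuming each gene is represented as a list of component-origin labels; the input layout, the example family and the function name `origin_ratios` are hypothetical, and the clustering step itself is not shown.

```python
ORIGINS = ("eukaryotic", "archaeal", "bacterial", "prokaryotic")


def origin_ratios(family):
    """Return, for each origin, the fraction of genes in the family that
    contain at least one component of that origin.

    family: non-empty list of genes; each gene is a list of origin labels,
    one label per component (hypothetical input format).
    """
    n_genes = len(family)
    return {
        origin: sum(1 for gene in family if origin in gene) / n_genes
        for origin in ORIGINS
    }


# Invented four-gene family: every gene has a bacterial component,
# half have an archaeal one, and one has a eukaryotic one.
example_family = [
    ["bacterial", "archaeal"],
    ["bacterial", "archaeal"],
    ["bacterial"],
    ["bacterial", "eukaryotic"],
]

print(origin_ratios(example_family))
```

In the caption's terms, a ratio of 1.0 corresponds to a red line (all genes contain that origin) and a ratio of 0.0 to a white line (no gene in the family contains it).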
Fig. 5
S-gene family 12,448. a Component architecture and phylogenetic tree of S-gene family 12,448. Family 12,448 is composed of two components (Flavodoxin and TYW1) of bacterial and archaeal origins, respectively, according to our BLASTp taxonomic assignment (for the phylogenetic tree, blue: SAR, red: Archaeplastida, purple: Opisthokonta, cyan: Haptophyta, yellow: Cryptophyta, blue-green: Discoba) (17 sequences, 407 sites, model LG + G4, 1000 ultrafast bootstraps). b Maximum likelihood (ML) phylogenetic tree of the flavodoxin component (green: Eukarya, blue: Archaea, red: Bacteria, black circle: bootstraps > 80%) (117 sequences, 164 sites, model LG + G4, 1000 ultrafast bootstraps). c ML phylogenetic tree of the TYW1 component (green: Eukarya, blue: Archaea, red: Bacteria, black circle: bootstraps > 80%) (117 sequences, 246 sites, model LG + I + G4, 1000 ultrafast bootstraps)

