Genome Biol. 2018 Jun 7;19(1):75.
doi: 10.1186/s13059-018-1454-9.

Hundreds of novel composite genes and chimeric genes with bacterial origins contributed to haloarchaeal evolution

Raphaël Méheust et al. Genome Biol. 2018.

Abstract

Background: Haloarchaea, a major group of archaea, are able to metabolize sugars and to live in oxygenated salty environments. Their physiology and lifestyle strongly contrast with those of their archaeal ancestors. Amino acid optimizations, which lowered the isoelectric point of haloarchaeal proteins, and abundant lateral gene transfers from bacteria have been invoked to explain this deep evolutionary transition. We use network analyses to show that the evolution of novel genes exclusive to Haloarchaea also contributed to the evolution of this group.

Results: We report the creation of 320 novel composite genes, both early in the evolution of Haloarchaea during haloarchaeal genesis and later in diverged haloarchaeal groups. One hundred and twenty-six of these novel composite genes derived from genetic material from bacterial genomes. These latter genes, largely involved in metabolic functions but also in oxygenic lifestyle, constitute a different gene pool from the laterally acquired bacterial genes formerly identified. These novel composite genes were likely advantageous for their hosts, since they show significant residence times in haloarchaeal genomes (consistent with a long phylogenetic history involving vertical descent and lateral gene transfer) and encode proteins with optimized isoelectric points.

Conclusions: Overall, our work encourages a systematic search for composite genes across all archaeal major groups, in order to better understand the origins of novel prokaryotic genes, and in order to test to what extent archaea might have adjusted their lifestyles by incorporating and recycling laterally acquired bacterial genetic fragments into new archaeal genes.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Hierarchical clustering of composite gene families according to their component origins (as assigned by BLAST). The heatmap represents the ratio of genes in a given family (columns) which have at least one component of a given origin (haloarchaeal, archaeal, bacterial or prokaryotic, rows). A white tick corresponds to the absence of components from a given origin in every gene in a given composite gene family. Colored ticks correspond to the presence of at least one component of a given origin at a given percentage (red for 100% of the genes in a composite gene family). The heatmap is hierarchically clustered by gene families. The colored top bar indicates the functional annotation of the composite gene families according to COG categories (red: metabolism, blue: information storage and processing, green: cellular processes and signaling, white: poorly characterized). The Euclidean distance and the complete linkage method were used for the hierarchical clustering.
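The clustering named in this caption (Euclidean distance with complete linkage) can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function name `complete_linkage` and the toy ratio vectors are hypothetical, and each family is assumed to be described by four ratios (haloarchaeal, archaeal, bacterial, prokaryotic) as in the heatmap.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(points):
    """Agglomerative clustering with complete linkage.

    Returns the merge history as (members_a, members_b, distance) tuples.
    The distance between two clusters is the largest Euclidean distance
    between any pair of their members.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = max(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical toy data: one vector per composite gene family, holding the
# ratio of genes with a haloarchaeal, archaeal, bacterial or prokaryotic
# component. These numbers are illustrative, not taken from the paper.
families = [
    (1.0, 0.0, 1.0, 0.0),
    (1.0, 0.0, 0.9, 0.0),
    (0.0, 1.0, 0.0, 0.2),
    (0.0, 1.0, 0.0, 0.0),
]
merges = complete_linkage(families)
```

In this toy input the two bacterial-rich families merge first, then the two archaeal-rich ones, and the two groups join last at the largest pairwise distance between them.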
Fig. 2
Barplot of functional annotation of the 126 ChiC-gene families (blue) and other composite families (red). D: Cell cycle control, cell division, chromosome partitioning, A: RNA processing and modification, C: Energy production and conversion, M: Cell wall/membrane/envelope biogenesis, B: Chromatin structure and dynamics, E: Amino acid transport and metabolism, N: Cell motility, J: Translation, ribosomal structure and biogenesis, F: Nucleotide transport and metabolism, O: Post-translational modification, protein turnover, and chaperones, K: Transcription, G: Carbohydrate transport and metabolism, T: Signal transduction mechanisms, L: Replication, recombination and repair, H: Coenzyme transport and metabolism, U: Intracellular trafficking, secretion, and vesicular transport, I: Lipid transport and metabolism, V: Defence mechanisms, P: Inorganic ion transport and metabolism, W: Extracellular structures, Q: Secondary metabolites biosynthesis, transport, and catabolism, Z: Cytoskeleton
Fig. 3
Domain architecture and origin of the 21 ChiC-protein families involved in carbohydrate transport and metabolism (red: Bacteria, blue: Archaea, orange: Prokaryote)
Fig. 4
Distribution of the 320 composite gene families in Haloarchaea. The heatmap represents the presence (black line) or absence (white line) of a given composite gene family in Haloarchaea genomes (each line represents a given genome, each column represents a gene family). Haloarchaea genomes are colored with respect to their classification into major clades according to the study by [26] (red: clade B, blue: clade A, green: clade C, yellow: clade D, and black: unassigned). The colored horizontal top bar (a) indicates the mean percentage of protein identity of each gene family (red > 80%, orange > 60%, yellow > 40%, white > 25%). The colored horizontal top bar (b) indicates the type of composite family (red: clusters 3, 4, 6, and 9, blue: clusters 1 and 10, white: clusters 2, 5, 7, and 8). The colored horizontal top bar (c) indicates the functional annotation of the gene families according to COG categories (red: metabolism, blue: information storage and processing, green: cellular processes and signaling, white: poorly characterized). A hierarchical clustering has been performed both on columns and rows using the Jaccard distance and a complete linkage method. The hierarchical clustering of the protein families (columns) highlights two distinct sets of proteins, proteins that are widespread (2) and those with a sparse distribution (1)
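The Jaccard distance used for this presence/absence heatmap can be sketched as follows. The genome and family identifiers are hypothetical, and returning 0.0 for two empty profiles is a convention chosen here, not something stated in the caption.

```python
def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two presence sets.

    Two empty sets are treated as identical (distance 0.0); this is an
    assumption of this sketch.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical presence profiles: the composite gene families found in
# each genome. Names are placeholders, not identifiers from the study.
presence = {
    "genome_1": {"fam_a", "fam_b", "fam_c"},
    "genome_2": {"fam_a", "fam_b"},
    "genome_3": {"fam_d"},
}

d12 = jaccard_distance(presence["genome_1"], presence["genome_2"])  # 1 - 2/3
d13 = jaccard_distance(presence["genome_1"], presence["genome_3"])  # no shared family
```

Genomes that share most of their composite gene families get a small distance and end up adjacent after clustering, while genomes with no family in common are at the maximum distance of 1.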
Fig. 5
a Boxplots showing the distribution of isoelectric points of proteins according to their origins and their types. The boxplot indicates the median line and the first and third quartiles. Outliers, lying more than 1.5× the interquartile range above the upper quartile or below the lower quartile, are shown as dots. b Boxplots showing the distribution of the isoelectric points of components originating from bacteria. Bacterial components correspond to bacterial genes that aligned with the ChiC-gene components assigned a bacterial origin. The boxplot indicates the median line and the first and third quartiles; outliers are defined and shown as in a.
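As a rough illustration of how boxplot outliers are usually flagged, the sketch below assumes the common Tukey convention: a point is an outlier when it lies more than 1.5 times the interquartile range beyond the first or third quartile. The quartile method (`inclusive`), the function name and the isoelectric point values are all assumptions made for the example; plotting tools may compute quartiles slightly differently.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical isoelectric points: mostly acidic proteins plus one basic
# protein. The values are illustrative, not data from the paper.
pi_values = [4.1, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 9.5]
outliers = boxplot_outliers(pi_values)
```

Here only the single basic value falls outside the fences, so it would be the one drawn as a dot.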

References

    1. López-García P, Zivanovic Y, Deschamps P, Moreira D. Bacterial gene import and mesophilic adaptation in archaea. Nat Rev Microbiol. 2015;13:447–456. doi: 10.1038/nrmicro3485.
    2. Grant WD. Life at low water activity. Philos Trans R Soc B Biol Sci. 2004;359:1249–1267. doi: 10.1098/rstb.2004.1502.
    3. Roesser M, Müller V. Osmoadaptation in bacteria and archaea: common principles and differences. Environ Microbiol. 2001;3:743–754. doi: 10.1046/j.1462-2920.2001.00252.x.
    4. Kennedy SP, Ng WV, Salzberg SL, Hood L, DasSarma S. Understanding the adaptation of Halobacterium species NRC-1 to its extreme environment through computational analysis of its genome sequence. Genome Res. 2001;11:1641–1650. doi: 10.1101/gr.190201.
    5. Nelson-Sathi S, Dagan T, Landan G, Janssen A, Steel M, McInerney JO, et al. Acquisition of 1,000 eubacterial genes physiologically transformed a methanogen at the origin of Haloarchaea. Proc Natl Acad Sci U S A. 2012;109. doi: 10.1073/pnas.1209119109.