Genome Biol Evol. 2014 Mar;6(3):474-81.
doi: 10.1093/gbe/evu031.

Archaeal "dark matter" and the origin of eukaryotes

Tom A Williams et al. Genome Biol Evol. 2014 Mar.

Abstract

Current hypotheses about the history of cellular life are mainly based on analyses of cultivated organisms, but these represent only a small fraction of extant biodiversity. The sequencing of new environmental lineages therefore provides an opportunity to test, revise, or reject existing ideas about the tree of life and the origin of eukaryotes. According to the textbook three domains hypothesis, the eukaryotes emerge as the sister group to a monophyletic Archaea. However, recent analyses incorporating better phylogenetic models and an improved sampling of the archaeal domain have generally supported the competing eocyte hypothesis, in which core genes of eukaryotic cells originated from within the Archaea, with important implications for eukaryogenesis. Given this trend, it was surprising that a recent analysis incorporating new genomes from uncultivated Archaea recovered a strongly supported three domains tree. Here, we show that this result was due in part to the use of a poorly fitting phylogenetic model and also to the inclusion by an automated pipeline of genes of putative bacterial origin rather than nucleocytosolic versions for some of the eukaryotes analyzed. When these issues were resolved, analyses including the new archaeal lineages placed core eukaryotic genes within the Archaea. These results are consistent with a number of recent studies in which improved archaeal sampling and better phylogenetic models agree in supporting the eocyte tree over the three domains hypothesis.

Keywords: Tree of Life; eukaryogenesis; phylogenetics; “dark matter”.

Figures

Fig. 1.
Bayesian phylogenies inferred from the dark matter supermatrix of Rinke et al. (2013). (a) The consensus tree inferred under the best-fitting LG single matrix model. This is a three domains (Woese et al. 1990) tree, with maximal support (PP = 1) for archaeal monophyly. (b) The tree inferred under the CAT + GTR model for this data set does not correspond to any published hypothesis on the tree of life, with the Archaea emerging from within a paraphyletic eukaryotic clade; this topology is likely due to contamination of the eukaryotic data set with genes of mitochondrial and plastid origin. Our interpretation is based on a root for the tree of life within the Bacteria (Cavalier-Smith 2006; Lake et al. 2009), or on the bacterial stem (Gogarten et al. 1989; Iwabe et al. 1989; Dagan et al. 2010). Branch lengths are proportional to expected numbers of substitutions per site, and support values are Bayesian posterior probabilities.
Fig. 2.
Bayesian phylogenies inferred from the dark matter data set after eukaryotic genes of bacterial origin had been replaced with their nucleocytosolic homologues. (a) Inference under the LG model recovers a weakly supported three domains tree, with support for archaeal monophyly reduced to 0.5. (b) The better-fitting CAT + GTR model recovers a strongly supported eocyte tree, with core eukaryotic genes forming a clade with the TACK superphylum of Archaea with maximum support (PP = 1). Our interpretation is based on a root for the tree of life within the Bacteria (Cavalier-Smith 2006; Lake et al. 2009), or on the bacterial stem (Gogarten et al. 1989; Iwabe et al. 1989; Dagan et al. 2010). Branch lengths are proportional to expected numbers of substitutions per site, and support values are Bayesian posterior probabilities.
Fig. 3.
Bayesian concatenated protein phylogeny inferred from a congruent set of 29 genes conserved in Bacteria, Archaea, and eukaryotes. The eukaryotes emerge from within the TACK superphylum of Archaea with maximal support. There is strong support (PP = 0.99) for the monophyly of Nanoarchaeum equitans with the newly sequenced “DPANN” Archaea. These are the 29 genes from Williams et al. (2012), updated to include the new archaeal sequences from the GEBA project (Rinke et al. 2013). The tree was inferred using the CAT + GTR model in PhyloBayes MPI (Lartillot et al. 2013). Our interpretation is based on a root for the tree of life within the Bacteria (Cavalier-Smith 2006; Lake et al. 2009), or on the bacterial stem (Gogarten et al. 1989; Iwabe et al. 1989; Dagan et al. 2010). Branch lengths are proportional to expected numbers of substitutions per site, and support values are Bayesian posterior probabilities.
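The support values quoted in the captions above are Bayesian posterior probabilities (PP). As a rough illustration of what such a value means, the sketch below estimates the PP of a clade as the fraction of sampled trees that contain it. This is a toy calculation, not the authors' pipeline or the PhyloBayes implementation: the function name `clade_support`, the representation of each tree as a set of clades, and the four-tree sample with its taxon labels are all hypothetical and chosen only to make the arithmetic visible.

```python
from typing import FrozenSet, Iterable, Set

# A clade is the set of taxon labels it contains; a tree is summarized
# here (hypothetically) as the set of clades it implies.
Clade = FrozenSet[str]
Tree = Set[Clade]


def clade_support(sampled_trees: Iterable[Tree], clade: Clade) -> float:
    """Estimate posterior support for `clade` as the fraction of
    sampled trees in which that clade appears."""
    trees = list(sampled_trees)
    if not trees:
        raise ValueError("need at least one sampled tree")
    hits = sum(1 for tree in trees if clade in tree)
    return hits / len(trees)


# Illustrative labels only; these are not the taxa or data from the paper.
archaea_monophyly: Clade = frozenset({"Euryarchaeota", "TACK", "DPANN"})
eocyte_clade: Clade = frozenset({"TACK", "Eukaryota"})

# A made-up sample of four trees: three contain the eocyte grouping,
# one contains a monophyletic Archaea.
toy_sample = [
    {eocyte_clade},
    {eocyte_clade},
    {archaea_monophyly},
    {eocyte_clade},
]

eocyte_pp = clade_support(toy_sample, eocyte_clade)
three_domains_pp = clade_support(toy_sample, archaea_monophyly)
```

In a real analysis the sample would come from the post-burn-in trees of a Markov chain Monte Carlo run, and the clades would be extracted from those trees by a phylogenetics library rather than written by hand.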
