Sci Rep. 2013:3:2015.
doi: 10.1038/srep02015.

A daily-updated tree of (sequenced) life as a reference for genome research

Hai Fang et al. Sci Rep. 2013.

Abstract

We report a daily-updated sequenced/species Tree Of Life (sTOL) as a reference for the increasing number of cellular organisms with sequenced genomes. The sTOL builds on a likelihood-based weight calibration algorithm to consolidate NCBI taxonomy information in concert with unbiased sampling of molecular characters from whole genomes of all sequenced organisms. By quantifying the extent of agreement between taxonomic and molecular data, we observe that many improvements could be made to the status quo classification, particularly in the Fungi kingdom; we also find that the current state of many animal genomes is rather poor. To augment the use of sTOL in providing evolutionary contexts, we integrate an ontology infrastructure and demonstrate its utility for evolutionary understanding of nuclear receptors, stem cells and eukaryotic genomes. The sTOL (http://supfam.org/SUPERFAMILY/sTOL) provides a binary tree of (sequenced) life, and contributes to an analytical platform linking genome evolution, function and phenotype.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Schematic flowchart illustrating the reconstruction of sTOL.
Figure 2. The extent of agreement between the NCBI taxonomy and the molecular data.
The circular phylogram displays the NCBI taxonomy, wherein the nodes are labelled with one of three categories (‘Recovered’ in red, ‘Alternative’ in green, and ‘Others’ in blue) by colour-coding the edge above that node. The pie charts illustrate the clade-specific fractions of these three categories for either terminal tips or internal nodes. The clades illustrated in the right panel (from top to bottom) include ‘Cellular organisms’, ‘Eukaryota’, ‘Archaea’ and ‘Bacteria’, and in the bottom panel (from left to right) ‘Metazoa’, ‘Fungi’, and ‘Viridiplantae’.
Figure 3. Detailed inspection of disagreements with the NCBI taxonomy.
The colour-coded tree is the NCBI taxonomy, with alternative topologies suggested by the molecular data inserted close by in black. (A) Metazoa clade, exemplified by an alternative topology (I, in green) suggested by the molecular data, which is likely due to biased genome assembly. (B) Viridiplantae clade containing four alternative topologies (I to IV, in green) suggested by the molecular data. (C) Fungal clade wherein the three internal nodes (I to III, in green) are strongly supported by the molecular data as being different.
Figure 4. Presence-absence pattern of the nuclear receptor ligand-binding domain across the eukaryotic species tree of life.
The left panel illustrates the overview of the eukaryotic tree, with a branch (edge) highlighted in green if the domain can be found in all genomes under the clade attached to the branch. The right panel is the zoomed-in version of the kingdom Viridiplantae (plants), which further contains two clades, embryophytes (land plants) and chlorophyta (green algae).
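
The highlighting rule described in this caption (a branch is marked when the domain occurs in every genome under the attached clade) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tree encoding, the function name `highlighted_branches`, and the toy genome labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def highlighted_branches(
    tree: Dict[str, List[str]],
    root: str,
    genomes_with_domain: Set[str],
) -> Set[str]:
    """Return the nodes whose incoming branch would be highlighted.

    `tree` maps each internal node to its children; any node that is not
    a key of `tree` is treated as a leaf (a sequenced genome).
    A node qualifies when every genome in its clade contains the domain.
    """
    highlighted: Set[str] = set()

    def visit(node: str) -> bool:
        children = tree.get(node, [])
        if not children:
            # Leaf: the domain is either present in this genome or not.
            present = node in genomes_with_domain
        else:
            # Visit every child first (no short-circuit), then require all.
            results = [visit(child) for child in children]
            present = all(results)
        if present:
            highlighted.add(node)
        return present

    visit(root)
    return highlighted


# Toy example (hypothetical labels, not real taxa or data).
toy_tree = {
    "root": ["A", "B"],
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
}
toy_presence = {"a1", "a2", "b1"}
result = highlighted_branches(toy_tree, "root", toy_presence)
```

In this toy case the clade `A` is highlighted because both of its genomes carry the domain, whereas `B` and `root` are not, since `b2` lacks it.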
Figure 5. A list of domains annotated with stem cell maintenance and their distribution over the three kingdoms in eukaryotic evolution.
The diagram in the top panel shows the paths covering the three kingdoms. The bottom panel lists their presence (1) and absence (0) patterns at the major branching points of eukaryotic evolution. The last row gives the number of distinct domains (i.e., superfamilies) related to stem cell maintenance.
