BMC Bioinformatics. 2010 Feb 24:11:102.
doi: 10.1186/1471-2105-11-102.

Reconstructing genome trees of prokaryotes using overlapping genes

Chih-Hsien Cheng et al. BMC Bioinformatics. 2010.

Abstract

Background: Overlapping genes (OGs) are adjacent genes whose coding sequences overlap partially or entirely. They are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes. Based on this property, we previously implemented a web server, named OGtree, that allows the user to reconstruct genome trees of some prokaryotes according to their pairwise OG distances. By analogy with analyses of gene content and gene order, we defined the OG distance between two genomes as a measure combining OG content (i.e., the normalized number of shared orthologous OG pairs) and OG order (i.e., the normalized OG breakpoint distance) across their whole genomes. A shortcoming of a breakpoint-based OG distance is that it cannot be applied to multi-chromosomal genomes. In addition, the amount of overlapping coding sequence between some distantly related prokaryotic genomes may be so limited that too few OGs are found to properly evaluate their pairwise OG distances.
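
The two ingredients named above, OG pairs and a normalized measure of shared OG pairs, can be illustrated with a minimal Python sketch. It is not the OGtree implementation: the abstract does not state the normalization, so a Jaccard-style ratio is assumed here, and the names `overlapping_pairs` and `og_content_distance`, the gene tuple layout, and the use of ortholog identifiers to match genes across genomes are all hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical gene record: (ortholog_id, start, end), inclusive coordinates
# on a single chromosome. Orthology assignment is assumed to be done already.
Gene = Tuple[str, int, int]
Pair = FrozenSet[str]


def overlapping_pairs(genes: List[Gene]) -> Set[Pair]:
    """Return the OG pairs: adjacent genes whose coordinates overlap."""
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs: Set[Pair] = set()
    for (id_a, _, end_a), (id_b, start_b, _) in zip(ordered, ordered[1:]):
        # Partial or entire overlap: the next gene starts before this one ends.
        if start_b <= end_a:
            pairs.add(frozenset((id_a, id_b)))
    return pairs


def og_content_distance(pairs_a: Set[Pair], pairs_b: Set[Pair]) -> float:
    """Assumed normalization: 1 - |shared| / |union| (Jaccard distance)."""
    union = pairs_a | pairs_b
    if not union:
        return 1.0  # Assumption: no OG pairs at all means no evidence of similarity.
    return 1.0 - len(pairs_a & pairs_b) / len(union)


genome_a = [("a", 1, 100), ("b", 90, 200), ("c", 250, 300), ("d", 295, 400)]
genome_b = [("a", 10, 120), ("b", 110, 210), ("c", 300, 350), ("d", 400, 500)]
distance = og_content_distance(overlapping_pairs(genome_a), overlapping_pairs(genome_b))
print(distance)
```

In this toy input, genome A has two OG pairs and genome B has one, and they share a single pair. The sketch covers OG content only; the OG order component (the breakpoint distance) is not modeled.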

Results: In this study, we therefore define a new OG order distance that is based on more biologically accurate rearrangements (e.g., reversals, transpositions and translocations) rather than breakpoints, and that is applicable to both uni-chromosomal and multi-chromosomal genomes. In addition, we expand the term "gene" to include both its coding sequence and its regulatory regions, so that two adjacent genes whose coding sequences or regulatory regions overlap are considered a pair of overlapping genes. The rationale is that overlapping regulatory regions of distinct genes suggest that the expression of these genes is, to some degree, co-regulated. Based on these modifications, we have reimplemented OGtree as a new web server, named OGtree2, and have evaluated its accuracy of genome tree reconstruction on a testing dataset consisting of 21 Proteobacteria genomes. Our experimental results show that OGtree2 significantly outperforms its previous version, OGtree, as well as another similar server, BPhyOG, in the quality of genome tree reconstruction: the phylogenetic tree obtained by OGtree2 is highly congruent with the reference tree, which coincides with the taxonomy accepted by biologists for these Proteobacteria.
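
The expanded definition of a gene can be illustrated with a small Python sketch. It rests on assumptions that the abstract does not state: the regulatory region is modeled as a fixed-length window upstream of the coding sequence, and the names `extended_interval`, `intervals_overlap` and `upstream_bp` are hypothetical. It shows only how two genes with disjoint coding sequences can still count as an overlapping pair once regulatory regions are included.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, 1-based coordinates


def extended_interval(start: int, end: int, strand: str, upstream_bp: int) -> Interval:
    """Extend a coding interval by an assumed upstream regulatory window."""
    if strand == "+":
        # Upstream of a forward-strand gene lies at lower coordinates.
        return (max(1, start - upstream_bp), end)
    if strand == "-":
        # Upstream of a reverse-strand gene lies at higher coordinates.
        return (start, end + upstream_bp)
    raise ValueError(f"unknown strand: {strand!r}")


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


# Two divergently transcribed genes whose coding sequences are 50 bp apart.
gene_x = (100, 200, "-")
gene_y = (250, 400, "+")

cds_only = intervals_overlap(gene_x[:2], gene_y[:2])
with_regulatory = intervals_overlap(
    extended_interval(*gene_x, upstream_bp=50),
    extended_interval(*gene_y, upstream_bp=50),
)
print(cds_only, with_regulatory)
```

With coding sequences alone the two genes do not overlap; with a 50 bp upstream window on each, their extended intervals do. The window length is purely illustrative.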

Conclusions: In this study, we have introduced a new web server OGtree2 at http://bioalgorithm.life.nctu.edu.tw/OGtree2.0/ that can serve as a useful tool for reconstructing more precise and robust genome trees of prokaryotes according to their overlapping genes.

Figures

Figure 1. Phylogenetic tree obtained from a trimmed alignment of 60 concatenated homologous proteins using the maximum likelihood method, with support values on its branches; adapted from [21].
Figure 2. Genome tree obtained using OGtree2 with the UPGMA method. The numbers on the branches are jackknife support values from 1,000 replicates.
Figure 3. Phylogenetic tree constructed using BPhyOG [8,9].
Figure 4. Genome tree obtained using OGtree with the UPGMA method [15].
Figure 5. Phylogenetic tree obtained from 16S rRNAs using the neighbor joining method. The numbers on the branches are bootstrap support values from 1,000 replicates.
Figure 6. Genome tree obtained using OGtree2 with the NJ method. The numbers on the branches are jackknife support values from 1,000 replicates.
Figure 7. Genome tree obtained using OGtree2 with the FM method. The numbers on the branches are jackknife support values from 1,000 replicates.
Figure 8. Basic pipeline of reconstructing genome trees using OG pairs.

References

    1. Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the reconstruction of the tree of life. Nature Reviews Genetics. 2005;6:361–375. doi: 10.1038/nrg1603.
    2. Snel B, Huynen MA, Dutilh BE. Genome trees and the nature of genome evolution. Annual Review of Microbiology. 2005;59:191–209. doi: 10.1146/annurev.micro.59.030804.121233.
    3. Snel B, Bork P, Huynen MA. Genome phylogeny based on gene content. Nature Genetics. 1999;21:108–110. doi: 10.1038/5052.
    4. Huson DH, Steel M. Phylogenetic trees based on gene content. Bioinformatics. 2004;20:2044–2049. doi: 10.1093/bioinformatics/bth198.
    5. Blanchette M, Kunisawa T, Sankoff D. Gene order breakpoint evidence in animal mitochondrial phylogeny. Journal of Molecular Evolution. 1999;49:193–203. doi: 10.1007/PL00006542.