Assessment of phylogenomic and orthology approaches for phylogenetic inference
- PMID: 17237036
- DOI: 10.1093/bioinformatics/btm015
Abstract
Motivation: Phylogenomics integrates the vast amount of phylogenetic information contained in complete genome sequences, and is rapidly becoming the standard for reliably inferring species phylogenies. There are, however, fundamental differences between the ways in which phylogenomic approaches like gene content, superalignment, superdistance and supertree integrate the phylogenetic information from separate orthologous groups. Furthermore, they all depend on the method by which the orthologous groups are initially determined. Here, we systematically compare these four phylogenomic approaches, in parallel with three approaches for large-scale orthology determination: pairwise orthology, cluster orthology and tree-based orthology.
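To illustrate how the superalignment approach integrates separate orthologous groups, the following minimal Python sketch concatenates per-group alignments into one long alignment per species. The abstract does not describe the authors' actual procedure: the function name `build_superalignment`, the species labels, the toy sequences and the choice to pad missing species with gap characters are all assumptions made for illustration.

```python
def build_superalignment(group_alignments, species):
    """Concatenate per-orthologous-group alignments into one alignment.

    group_alignments: list of dicts mapping species name -> aligned sequence.
    species: ordered list of all species to include.
    A species missing from a group is padded with gaps (an assumption here;
    real pipelines may instead drop such groups).
    """
    rows = {sp: [] for sp in species}
    for aln in group_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            rows[sp].append(aln.get(sp, "-" * width))
    return {sp: "".join(parts) for sp, parts in rows.items()}


# Hypothetical toy data: two orthologous groups, three species.
species = ["sp_A", "sp_B", "sp_C"]
groups = [
    {"sp_A": "MKT-L", "sp_B": "MKTAL", "sp_C": "MRTAL"},
    {"sp_A": "GGS", "sp_C": "GAS"},  # sp_B has no ortholog in this group
]
superalignment = build_superalignment(groups, species)
for name in species:
    print(name, superalignment[name])
```

A single tree would then be inferred from the concatenated alignment. By contrast, supertree and superdistance methods first infer one tree or one distance matrix per orthologous group and combine those afterwards.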
Results: Including various phylogenetic methods, we apply a total of 54 fully automated phylogenomic procedures to the fungi, the eukaryotic clade with the largest number of sequenced genomes, for which we retrieved a gold standard phylogeny from the literature. Phylogenomic trees based on gene content show, relative to the other methods, a bias in the tree topology that parallels convergence in lifestyle among the species compared, indicating convergence in gene content.
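As a rough illustration of why gene content can track lifestyle rather than ancestry, the sketch below computes a pairwise distance from the presence or absence of orthologous groups alone. The abstract does not state which gene content measure was used; the Jaccard distance, the function name `gene_content_distance` and the toy genomes are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations


def gene_content_distance(groups_a, groups_b):
    """Jaccard distance between two sets of orthologous-group identifiers.

    This is one possible gene content measure, assumed for illustration;
    it is not claimed to be the measure used in the study.
    """
    union = groups_a | groups_b
    if not union:
        return 0.0
    return 1.0 - len(groups_a & groups_b) / len(union)


# Hypothetical genomes described only by the orthologous groups they contain.
genomes = {
    "sp_A": {"OG1", "OG2", "OG3", "OG4"},
    "sp_B": {"OG1", "OG2", "OG3"},
    "sp_C": {"OG1", "OG5"},
}

distances = {
    (a, b): gene_content_distance(genomes[a], genomes[b])
    for a, b in combinations(sorted(genomes), 2)
}
for pair, d in distances.items():
    print(pair, round(d, 3))
```

Because the measure sees only which groups are present, two distantly related species that have independently lost the same genes would appear close, which is the kind of convergence the results describe.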
Conclusions: Complete genomes are no guarantee of good or even consistent phylogenies. However, the large amounts of data in genomes enable us to carefully select the data most suitable for phylogenomic inference. In terms of performance, the superalignment approach, combined with restrictive orthology, is the most successful in recovering a fungal phylogeny that agrees with current taxonomic views, and allows us to obtain a high-resolution phylogeny. We provide solid support for what has grown to be a common practice in phylogenomics during its advance in recent years.
Supplementary information: Supplementary data are available at Bioinformatics online.