Inferring phylogeny from whole genomes
- PMID: 17237078
- DOI: 10.1093/bioinformatics/btl296
Abstract
Motivation: Inferring species phylogenies with a history of gene losses and duplications is a challenging and important task in computational biology. This problem can be addressed with duplication-loss models, in which the primary step is to reconcile a rooted gene tree with a rooted species tree. Most modern methods of phylogenetic reconstruction (from sequences) produce unrooted gene trees. This limitation leads to the problem of transforming an unrooted gene tree into a rooted tree and then reconciling the rooted trees. The main questions are: 'What is the biological interpretation of the chosen rooting?', 'Can we find the optimal rootings efficiently?' and 'Is the optimal rooting unique?'.
Results: In this paper we present a model for reconciling an unrooted gene tree with a rooted species tree, based on choosing a rooting that has minimal reconciliation cost. Our analysis leads to the surprising property that all the minimal rootings have identical distributions of gene duplications and gene losses in the species tree. This implies, in our opinion, that the concept of an optimal rooting is very robust, and thus biologically meaningful. It also has nice computational properties: we present a linear time and space algorithm for computing the optimal rooting(s). This algorithm was used in two different ways to reconstruct the optimal species phylogeny of five known yeast genomes from approximately 4700 gene trees. Moreover, we determined the locations (history) of all gene duplications and gene losses in the final species tree. Interestingly, the top five species trees are the same for both methods.
Availability: Software and documentation are freely available from http://bioputer.mimuw.edu.pl/~gorecki/urec
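The idea of choosing a rooting with minimal reconciliation cost can be illustrated with a small sketch. It is a naive brute-force illustration, not the linear time algorithm of the paper and not the URec software. It rests on several assumptions that the abstract does not state: trees are binary and written as nested Python tuples with species names at the leaves, reconciliation uses the standard LCA (least common ancestor) mapping, losses are counted with one common depth-based formula, and the cost of a rooting is the number of duplications plus the number of losses. All function names are hypothetical.

```python
from itertools import count

def index_species_tree(S):
    """List (cluster, depth) for every node of the rooted species tree S."""
    nodes = []
    def walk(node, depth):
        if isinstance(node, str):
            cluster = frozenset([node])
        else:
            cluster = frozenset().union(*(walk(c, depth + 1) for c in node))
        nodes.append((cluster, depth))
        return cluster
    walk(S, 0)
    return nodes

def lca(nodes, cluster):
    """Deepest species-tree node whose cluster contains the given species."""
    return max((n for n in nodes if cluster <= n[0]), key=lambda n: n[1])

def cost(G, nodes):
    """Return (duplications, losses) for a rooted binary gene tree G."""
    dups = losses = 0
    def walk(g):
        nonlocal dups, losses
        if isinstance(g, str):
            return lca(nodes, frozenset([g]))
        a, b = (walk(c) for c in g)
        m = lca(nodes, a[0] | b[0])           # LCA mapping of this gene node
        drop = (a[1] - m[1]) + (b[1] - m[1])  # edges skipped in the species tree
        if m == a or m == b:                  # duplication node
            dups += 1
            losses += drop
        else:                                 # speciation node
            losses += drop - 2
        return m
    walk(G)
    return dups, losses

def to_graph(tree):
    """Forget the root: undirected adjacency map plus leaf labels."""
    adj, label, ids = {}, {}, count()
    def walk(node):
        nid = next(ids)
        adj[nid] = []
        if isinstance(node, str):
            label[nid] = node
        else:
            for child in node:
                cid = walk(child)
                adj[nid].append(cid)
                adj[cid].append(nid)
        return nid
    root = walk(tree)
    if len(adj[root]) == 2:                   # suppress the degree-2 root
        a, b = adj.pop(root)
        adj[a].remove(root)
        adj[b].remove(root)
        adj[a].append(b)
        adj[b].append(a)
    return adj, label

def subtree(adj, label, node, parent):
    """Rooted subtree hanging below `node` when entered from `parent`."""
    if node in label:
        return label[node]
    return tuple(subtree(adj, label, n, node) for n in adj[node] if n != parent)

def rootings(tree):
    """Yield one rooted tree per edge of the unrooted version of `tree`."""
    adj, label = to_graph(tree)
    for u in adj:
        for v in adj[u]:
            if u < v:
                yield (subtree(adj, label, u, v), subtree(adj, label, v, u))

def optimal_rootings(G, S):
    """Minimal duplication+loss cost and all rootings of G that attain it."""
    nodes = index_species_tree(S)
    scored = [(sum(cost(r, nodes)), r) for r in rootings(G)]
    best = min(s for s, _ in scored)
    return best, [r for s, r in scored if s == best]

# Toy input: species tree ((a,b),c) and a gene tree with two genes from a.
S = (("a", "b"), "c")
G = (("a", "b"), ("c", "a"))   # the rooting given here is arbitrary and ignored
best, roots = optimal_rootings(G, S)
print(best, roots)
```

For this toy input the sketch tries all five rootings of the gene tree and finds a single optimal one, with one duplication and one loss. Trying every edge and recomputing the cost from scratch is quadratic or worse in the tree size, so it is only meant to make the definition concrete.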