Fast and accurate phylogenetic reconstruction from high-resolution whole-genome data and a novel robustness estimator
- PMID: 21899420
- DOI: 10.1089/cmb.2011.0114
Abstract
The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis. We describe a fast and accurate algorithm for rearrangement analysis that scales up, in both time and accuracy, to modern high-resolution genomic data. We also describe a novel approach to estimate the robustness of results, an equivalent to the bootstrapping analysis used in sequence-based phylogenetic reconstruction. We present the results of extensive testing on both simulated and real data showing that our algorithm returns very accurate results, while scaling linearly with the size of the genomes and cubically with their number. We also present extensive experimental results showing that our approach to robustness testing provides excellent estimates of confidence, which, moreover, can be tuned to trade off thresholds between false positives and false negatives. Together, these two novel approaches enable us to attack heretofore intractable problems, such as phylogenetic inference for high-resolution vertebrate genomes, as we demonstrate on a set of six vertebrate genomes with 8,380 syntenic blocks. A copy of the software is available on demand.
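The abstract does not say how the robustness estimator is computed, so the following is only a generic sketch of what per-edge support scores and a tunable acceptance threshold look like in practice. It is not the authors' method. Each tree edge is represented as a bipartition of the taxa, and an edge's support is the fraction of replicate trees that contain the same bipartition. All names here (`canonical_split`, `edge_support`, `accepted_edges`, the five taxa, and the replicate trees) are hypothetical and chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Split = FrozenSet[str]  # one side of a bipartition of the taxa


def canonical_split(side: Iterable[str], taxa: Iterable[str]) -> Split:
    """Return the side of the bipartition that excludes the smallest taxon name,
    so that the same edge always maps to the same set."""
    all_taxa = frozenset(taxa)
    part = frozenset(side)
    anchor = min(all_taxa)
    return all_taxa - part if anchor in part else part


def edge_support(reference: Set[Split], replicates: List[Set[Split]]) -> Dict[Split, float]:
    """Fraction of replicate trees that contain each edge of the reference tree."""
    if not replicates:
        raise ValueError("at least one replicate tree is required")
    n = len(replicates)
    return {s: sum(s in rep for rep in replicates) / n for s in reference}


def accepted_edges(support: Dict[Split, float], threshold: float) -> Set[Split]:
    """Keep only the edges whose support reaches the chosen threshold."""
    return {s for s, value in support.items() if value >= threshold}


# Hypothetical data: five taxa, one reference tree, four replicate trees.
TAXA = "ABCDE"


def splits(*sides: str) -> Set[Split]:
    return {canonical_split(side, TAXA) for side in sides}


reference = splits("AB", "DE")
replicates = [
    splits("AB", "DE"),
    splits("AB", "DE"),
    splits("AC", "BD"),
    splits("AB", "CD"),
]

support = edge_support(reference, replicates)
kept = accepted_edges(support, threshold=0.7)
```

In this toy data the edge separating A and B from the rest appears in three of four replicates and the edge separating D and E appears in two of four, so a threshold of 0.7 keeps only the first. Raising the threshold rejects more edges (fewer false positives, more false negatives), and lowering it does the reverse.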