On the inference of parsimonious indel evolutionary scenarios
- PMID: 16960972
- DOI: 10.1142/s0219720006002168
On the inference of parsimonious indel evolutionary scenarios
Abstract
Given a multiple alignment of orthologous DNA sequences and a phylogenetic tree for these sequences, we investigate the problem of reconstructing a most parsimonious scenario of insertions and deletions capable of explaining the gaps observed in the alignment. This problem, called the Indel Parsimony Problem, is a crucial component of the problem of ancestral genome reconstruction, and its solution provides valuable information to many genome functional annotation approaches. We first show that the problem is NP-complete. Second, we provide an algorithm, based on the fractional relaxation of an integer linear programming formulation. The algorithm is fast in practice, and the solutions it produces are, in most cases, provably optimal. We describe a divide-and-conquer approach that makes it possible to solve very large instances on a simple desktop machine, while retaining guaranteed optimality. Our algorithms are tested and shown efficient and accurate on a set of 1.8 Mb mammalian orthologous sequences in the CFTR region.
Similar articles
-
Exact and heuristic algorithms for the Indel Maximum Likelihood Problem.J Comput Biol. 2007 May;14(4):446-61. doi: 10.1089/cmb.2007.A006. J Comput Biol. 2007. PMID: 17572023
-
Chromosome structures: reduction of certain problems with unequal gene content and gene paralogs to integer linear programming.BMC Bioinformatics. 2017 Dec 6;18(1):537. doi: 10.1186/s12859-017-1944-x. BMC Bioinformatics. 2017. PMID: 29212445 Free PMC article.
-
Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes.BMC Evol Biol. 2003 Jan 6;3:2. doi: 10.1186/1471-2148-3-2. Epub 2003 Jan 6. BMC Evol Biol. 2003. PMID: 12515582 Free PMC article.
-
Bayesian coestimation of phylogeny and sequence alignment.BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83. BMC Bioinformatics. 2005. PMID: 15804354 Free PMC article.
-
Negative information for building phylogenies.Recent Pat DNA Gene Seq. 2013 Aug;7(2):128-36. doi: 10.2174/1872215611307020007. Recent Pat DNA Gene Seq. 2013. PMID: 22974263 Review.
Cited by
-
Single-character insertion-deletion model preserves long indels in ancestral sequence reconstruction.BMC Bioinformatics. 2024 Dec 2;25(1):370. doi: 10.1186/s12859-024-05986-1. BMC Bioinformatics. 2024. PMID: 39617897 Free PMC article.
-
Genome-wide nucleotide-level mammalian ancestor reconstruction.Genome Res. 2008 Nov;18(11):1829-43. doi: 10.1101/gr.076521.108. Epub 2008 Oct 10. Genome Res. 2008. PMID: 18849525 Free PMC article.
-
Phylo: a citizen science approach for improving multiple sequence alignment.PLoS One. 2012;7(3):e31362. doi: 10.1371/journal.pone.0031362. Epub 2012 Mar 7. PLoS One. 2012. PMID: 22412834 Free PMC article.
-
Probabilistic phylogenetic inference with insertions and deletions.PLoS Comput Biol. 2008 Sep 19;4(9):e1000172. doi: 10.1371/journal.pcbi.1000172. PLoS Comput Biol. 2008. PMID: 18787703 Free PMC article.
-
General continuous-time Markov model of sequence evolution via insertions/deletions: are alignment probabilities factorable?BMC Bioinformatics. 2016 Aug 11;17:304. doi: 10.1186/s12859-016-1105-7. BMC Bioinformatics. 2016. PMID: 27638547 Free PMC article.
MeSH terms
Substances
LinkOut - more resources
Full Text Sources
Miscellaneous