Haplotype inference by Pure Parsimony: a survey
- PMID: 20726791
- DOI: 10.1089/cmb.2009.0101
Haplotype inference by Pure Parsimony: a survey
Abstract
Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference. The haplotype inference problem under the assumption of pure parsimony consists in finding the smallest number of haplotypes that explain a given set of genotypes. This problem is NP-hard. The original formulations for solving the Haplotype Inference by Pure Parsimony (HIPP) problem were based on integer linear programming and branch-and-bound techniques. More recently, solutions based on Boolean satisfiability, pseudo-Boolean optimization, and answer set programming have been shown to be remarkably more efficient. HIPP can now be regarded as a feasible approach for haplotype inference, which can be competitive with other different approaches. This article provides an overview of the methods for solving the HIPP problem, including preprocessing, bounding techniques, and heuristic approaches. The article also presents an empirical evaluation of exact HIPP solvers on a comprehensive set of synthetic and real problem instances. Moreover, the bounding techniques to the exact problem are evaluated. The final section compares and discusses the HIPP approach with a well-established statistical method that represents the reference algorithm for this problem.
Similar articles
-
Mathematical properties and bounds on haplotyping populations by pure parsimony.Math Biosci. 2011 Jun;231(2):120-5. doi: 10.1016/j.mbs.2011.02.008. Epub 2011 Feb 24. Math Biosci. 2011. PMID: 21354185
-
An improved preprocessing algorithm for haplotype inference by pure parsimony.J Bioinform Comput Biol. 2014 Aug;12(4):1450020. doi: 10.1142/S0219720014500206. Epub 2014 Aug 1. J Bioinform Comput Biol. 2014. PMID: 25152045
-
A preprocessing procedure for haplotype inference by pure parsimony.IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1183-95. doi: 10.1109/TCBB.2010.125. IEEE/ACM Trans Comput Biol Bioinform. 2011. PMID: 21116044
-
An improved heuristic for haplotype inference.Gene. 2012 Oct 10;507(2):177-82. doi: 10.1016/j.gene.2012.06.032. Epub 2012 Jul 7. Gene. 2012. PMID: 22776766
-
Algorithms for inferring haplotypes.Genet Epidemiol. 2004 Dec;27(4):334-47. doi: 10.1002/gepi.20024. Genet Epidemiol. 2004. PMID: 15368348 Review.
Cited by
-
Disentangling homeologous contigs in allo-tetraploid assembly: application to durum wheat.BMC Bioinformatics. 2013;14 Suppl 15(Suppl 15):S15. doi: 10.1186/1471-2105-14-S15-S15. Epub 2013 Oct 15. BMC Bioinformatics. 2013. PMID: 24564644 Free PMC article.
Publication types
MeSH terms
LinkOut - more resources
Full Text Sources
Miscellaneous