Identifying mutation regions for closely related individuals without a known pedigree
- PMID: 22731852
- PMCID: PMC3507658
- DOI: 10.1186/1471-2105-13-146
Abstract
Background: Linkage analysis is the first step in the search for a disease gene. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations leading to a disease phenotype. In this paper, we study the case in which the sampled individuals are closely related but the pedigree is not given. This situation commonly arises when the individuals share a common ancestor six or more generations back. To our knowledge, no existing algorithm gives good results for this case.
Results: To solve this problem, we first developed heuristic algorithms for haplotype inference without a given pedigree. We propose a model based on the parsimony principle that can be viewed as an extension of the model first proposed by Dan Gusfield. Our heuristic algorithm uses Clark's inference rule to infer haplotype segments.
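To illustrate the inference rule mentioned above, the following is a minimal sketch of Clark's rule in its generic textbook form, not the segment-based heuristic of this paper. The genotype encoding (0 and 1 for the two homozygous states, 2 for heterozygous) and the function names `clark_complement` and `clark_resolve` are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Haplotype = List[int]  # each site is 0 or 1
Genotype = List[int]   # assumed encoding: 0/1 = homozygous, 2 = heterozygous


def clark_complement(genotype: Genotype, known: Haplotype) -> Optional[Haplotype]:
    """Return the haplotype that pairs with `known` to explain `genotype`,
    or None if `known` is incompatible with `genotype`."""
    if len(genotype) != len(known):
        raise ValueError("genotype and haplotype must have the same length")
    other = []
    for g, h in zip(genotype, known):
        if g == 2:
            other.append(1 - h)   # heterozygous site: partner carries the other allele
        elif g == h:
            other.append(h)       # homozygous site: both haplotypes agree
        else:
            return None           # known haplotype conflicts with a homozygous site
    return other


def clark_resolve(genotypes: List[Genotype]) -> Tuple[List[Haplotype], List[Genotype]]:
    """Greedy application of Clark's rule.

    Genotypes with at most one heterozygous site are unambiguous and seed
    the pool of known haplotypes; the rest are resolved against that pool
    where possible. Returns (known haplotypes, still-unresolved genotypes).
    """
    known: List[Haplotype] = []
    unresolved: List[Genotype] = []
    for g in genotypes:
        if sum(1 for x in g if x == 2) <= 1:
            for fill in (0, 1):
                h = [fill if x == 2 else x for x in g]
                if h not in known:
                    known.append(h)
        else:
            unresolved.append(g)

    progress = True
    while unresolved and progress:
        progress = False
        for g in list(unresolved):
            for h in known:
                other = clark_complement(g, h)
                if other is not None:
                    if other not in known:
                        known.append(other)
                    unresolved.remove(g)
                    progress = True
                    break
    return known, unresolved
```

For example, given the homozygous genotype `[0, 0, 1]` and the ambiguous genotype `[0, 2, 2]`, the known haplotype `[0, 0, 1]` is compatible with the second genotype, so the rule infers `[0, 1, 0]` as its partner. The result of the greedy loop depends on the order in which genotypes and haplotypes are tried, which is one reason parsimony-based formulations are used to choose among alternatives.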
Conclusions: We ran our program on both simulated data and a set of real data from the phase II HapMap database. Experiments show that our program performs well. Recall ranges from 90% to 99% across the tested cases, which means the program reports at least 90% of the true mutation regions. Precision ranges from 29% to 90%. At a precision of 29%, the reported regions are roughly three times the size of the true mutation region, which is still useful for narrowing down the location of the disease gene. For all tested cases, with about 110,000 SNPs on a chromosome, our program completes the computation within 20 seconds.
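The relationship between precision and region size can be made concrete with a short sketch. It assumes length-based definitions (recall is the overlap divided by the true region length, precision is the overlap divided by the reported region length), which is consistent with the statement above but is not spelled out in the abstract. The function name and the half-open SNP-index intervals are hypothetical.

```python
from typing import Tuple

Region = Tuple[int, int]  # half-open interval [start, end) in SNP indices (assumed)


def region_recall_precision(true_region: Region, reported_region: Region) -> Tuple[float, float]:
    """Length-based recall and precision for one true and one reported region."""
    t0, t1 = true_region
    r0, r1 = reported_region
    if t1 <= t0 or r1 <= r0:
        raise ValueError("regions must have positive length")
    # Number of SNP positions shared by the two intervals (0 if disjoint).
    overlap = max(0, min(t1, r1) - max(t0, r0))
    recall = overlap / (t1 - t0)
    precision = overlap / (r1 - r0)
    return recall, precision


# A reported region that fully covers a 100-SNP true region but spans 345 SNPs
# has recall 1.0 and precision 100/345, i.e. about 0.29.
recall, precision = region_recall_precision((100, 200), (0, 345))
```

Under these assumed definitions, a precision near 29% with near-complete recall corresponds to a reported region a little over three times as long as the true one.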
Similar articles
- Mutation region detection for closely related individuals without a known pedigree using high-density genotype data. IEEE/ACM Trans Comput Biol Bioinform. 2012;9(2):499-510. doi: 10.1109/TCBB.2011.134. Epub 2011 Oct 17. PMID: 22025760
- CollHaps: a heuristic approach to haplotype inference by parsimony. IEEE/ACM Trans Comput Biol Bioinform. 2010 Jul-Sep;7(3):511-23. doi: 10.1109/TCBB.2008.130. PMID: 20671321
- HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination. Bioinformatics. 2005 Jan 1;21(1):90-103. doi: 10.1093/bioinformatics/bth388. Epub 2004 Jul 1. PMID: 15231536
- Linked region detection using high-density SNP genotype data via the minimum recombinant model of pedigree haplotype inference. BMC Bioinformatics. 2009 Jul 15;10:216. doi: 10.1186/1471-2105-10-216. PMID: 19604391. Free PMC article.
- Using familial information for variant filtering in high-throughput sequencing studies. Hum Genet. 2014 Nov;133(11):1331-41. doi: 10.1007/s00439-014-1479-4. Epub 2014 Aug 17. PMID: 25129038. Free PMC article. Review.