Direct analysis of unphased SNP genotype data in population-based association studies via Bayesian partition modelling of haplotypes
- PMID: 15940704
- DOI: 10.1002/gepi.20080
Abstract
We describe a novel method for assessing the strength of disease association with single nucleotide polymorphisms (SNPs) in a candidate gene or small candidate region, and for estimating the corresponding haplotype relative risks of disease, using unphased genotype data directly. We begin by estimating the relative frequencies of haplotypes consistent with observed SNP genotypes. Under the Bayesian partition model, we specify cluster centres from this set of consistent SNP haplotypes. The remaining haplotypes are then assigned to the cluster with the "nearest" centre, where distance is defined in terms of SNP allele matches. Within a logistic regression modelling framework, each haplotype within a cluster is assigned the same disease risk, reducing the number of parameters required. Uncertainty in phase assignment is addressed by considering all possible haplotype configurations consistent with each unphased genotype, weighted in the logistic regression likelihood by their probabilities, calculated according to the estimated relative haplotype frequencies. We develop a Markov chain Monte Carlo algorithm to sample over the space of haplotype clusters and corresponding disease risks, allowing for covariates that might include environmental risk factors or polygenic effects. Application of the algorithm to SNP genotype data in an 890-kb region flanking the CYP2D6 gene illustrates that we can identify clusters of haplotypes with similar risk of poor drug metaboliser (PDM) phenotype, and can distinguish PDM cases carrying different high-risk variants. Further, the results of a detailed simulation study suggest that we can identify positive evidence of association for moderate relative disease risks with a sample of 1,000 cases and 1,000 controls.
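The phase-weighting and clustering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every function name here is hypothetical. It makes three assumptions that the abstract does not state: the probability of a haplotype pair is the product of the estimated haplotype frequencies (Hardy-Weinberg equilibrium), "nearest" means the largest count of matching SNP alleles, and the two haplotypes of an individual contribute additively on the log-odds scale. The MCMC sampling over clusters and risks is omitted.

```python
import math
from itertools import product


def consistent_pairs(genotype):
    """List unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of minor-allele counts per SNP (0, 1 or 2).
    Haplotypes are tuples of 0/1 alleles.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites: 0 -> 0, 2 -> 1
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b  # heterozygous site: one copy of each allele
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def phase_weights(genotype, freq):
    """Probability of each consistent pair given estimated haplotype frequencies.

    Assumption: pair probability is proportional to the product of the two
    haplotype frequencies (doubled for distinct haplotypes), i.e. HWE.
    """
    pairs = consistent_pairs(genotype)
    raw = []
    for h1, h2 in pairs:
        p = freq.get(h1, 0.0) * freq.get(h2, 0.0)
        raw.append(2.0 * p if h1 != h2 else p)
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no consistent haplotype pair has positive frequency")
    return {pair: w / total for pair, w in zip(pairs, raw)}


def assign_to_cluster(haplotype, centres):
    """Index of the centre sharing the most SNP alleles (ties go to the first).

    Assumption: similarity is a plain count of matching alleles.
    """
    def matches(a, b):
        return sum(x == y for x, y in zip(a, b))
    return max(range(len(centres)), key=lambda k: matches(haplotype, centres[k]))


def individual_likelihood(genotype, is_case, freq, centres, log_odds, baseline):
    """Phase-weighted logistic likelihood contribution of one individual.

    log_odds[k] is the shared risk parameter of cluster k; baseline is the
    intercept. Assumption: the two haplotype effects add on the logit scale.
    """
    total = 0.0
    for (h1, h2), w in phase_weights(genotype, freq).items():
        eta = (baseline
               + log_odds[assign_to_cluster(h1, centres)]
               + log_odds[assign_to_cluster(h2, centres)])
        p_case = 1.0 / (1.0 + math.exp(-eta))
        total += w * (p_case if is_case else 1.0 - p_case)
    return total
```

For a two-SNP double heterozygote, `consistent_pairs((1, 1))` yields the two possible phasings, and `phase_weights` splits probability between them according to the supplied frequencies. Because every haplotype in a cluster shares one entry of `log_odds`, the number of risk parameters equals the number of clusters rather than the number of haplotypes.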
Similar articles
- Accounting for haplotype uncertainty in matched association studies: a comparison of simple and flexible techniques. Genet Epidemiol. 2005 Apr;28(3):261-72. doi: 10.1002/gepi.20061. PMID: 15637718
- Fine mapping of disease genes via haplotype clustering. Genet Epidemiol. 2006 Feb;30(2):170-9. doi: 10.1002/gepi.20134. PMID: 16385468
- Linkage disequilibrium assessment via log-linear modeling of SNP haplotype frequencies. Genet Epidemiol. 2003 Sep;25(2):106-14. doi: 10.1002/gepi.10254. PMID: 12916019
- Tag SNP selection for association studies. Genet Epidemiol. 2004 Dec;27(4):365-74. doi: 10.1002/gepi.20028. PMID: 15372618 Review.
- [Analysis and application of SNP and haplotype in the human genome]. Yi Chuan Xue Bao. 2005 Aug;32(8):879-89. PMID: 16231744 Review. Chinese.
Cited by
- High-density SNP association study and copy number variation analysis of the AUTS1 and AUTS5 loci implicate the IMMP2L-DOCK4 gene region in autism susceptibility. Mol Psychiatry. 2010 Sep;15(9):954-68. doi: 10.1038/mp.2009.34. Epub 2009 Apr 28. PMID: 19401682 Free PMC article.
- A flexible Bayesian framework for modeling haplotype association with disease, allowing for dominance effects of the underlying causative variants. Am J Hum Genet. 2006 Oct;79(4):679-94. doi: 10.1086/508264. Epub 2006 Aug 31. PMID: 16960804 Free PMC article.
- Association mapping by generalized linear regression with density-based haplotype clustering. Genet Epidemiol. 2009 Jan;33(1):16-26. doi: 10.1002/gepi.20352. PMID: 18561202 Free PMC article.
- Genetic association mapping via evolution-based clustering of haplotypes. PLoS Genet. 2007 Jul;3(7):e111. doi: 10.1371/journal.pgen.0030111. PMID: 17616979 Free PMC article.
- Detailed investigation of the role of common and low-frequency WFS1 variants in type 2 diabetes risk. Diabetes. 2010 Mar;59(3):741-6. doi: 10.2337/db09-0920. Epub 2009 Dec 22. PMID: 20028947 Free PMC article.