Power-based, phase-informed selection of single nucleotide polymorphisms for disease association screens
- PMID: 16685721
- DOI: 10.1002/gepi.20159
Abstract
Single nucleotide polymorphisms (SNPs) are becoming widely used as genotypic markers in genetic association studies of common, complex human diseases. For such association screens, a crucial part of study design is determining which SNPs to prioritize for genotyping. We present a novel power-based algorithm to select a subset of tag SNPs for genotyping from a map of available SNPs. Blocks of markers in strong linkage disequilibrium (LD) are identified, and SNPs are selected to represent each block such that power to detect disease association with an underlying disease allele in LD with block members is preserved; all markers outside of blocks are also included in the tagging subset. A key, novel element of this method is that it incorporates information about the phase of LD observed among marker pairs to retain markers likely to be in coupling phase with an underlying disease locus, thus increasing power compared to a phase-blind approach. Power calculations illustrate important issues regarding LD phase and make clear the advantages of our approach to SNP selection. We apply our algorithm to genotype data from the International HapMap Consortium and demonstrate that considerable reduction in SNP genotyping may be attained while retaining much of the available power for a disease association screen. We also demonstrate that these tag SNPs effectively represent underlying variants not included in the LD analysis and SNP selection, by using leave-one-out tests to show that most (approximately 90%) of the "untyped" variants lying in blocks are in coupling-phase LD with a tag SNP. Additional performance tests using the HapMap ENCyclopedia of DNA Elements (ENCODE) regions show that the method compares well with the popular r² bin tagging method. This work is a concrete example of how empirical LD phase may be used to benefit study design.
Copyright (c) 2006 Wiley-Liss, Inc.
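The distinction the abstract draws between phase-informed and phase-blind tagging can be illustrated with the standard pairwise LD statistics. The sketch below is not the authors' algorithm; it only shows that the sign of the disequilibrium coefficient D carries phase information that r² discards. The function name `pairwise_ld`, the 0/1 allele coding, and the convention that allele 1 is the allele of interest at each SNP (for example the minor allele) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def pairwise_ld(hap_a: Sequence[int], hap_b: Sequence[int]) -> Tuple[float, float, str]:
    """Return (D, r2, phase) for two biallelic SNPs from phased haplotypes.

    Assumption: alleles are coded 0/1 and allele 1 is the allele of
    interest at each SNP (e.g. the minor allele). Phase is reported
    relative to that coding.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and of equal length")

    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    r2 = d * d / denom  # squared correlation; the sign of D is lost here

    if d > 0:
        phase = "coupling"
    elif d < 0:
        phase = "repulsion"
    else:
        phase = "none"
    return d, r2, phase


if __name__ == "__main__":
    # Two toy marker pairs with identical r2 but opposite phase.
    print(pairwise_ld([1, 1, 0, 0], [1, 1, 0, 0]))
    print(pairwise_ld([1, 1, 0, 0], [0, 0, 1, 1]))
```

In the two toy pairs, r² is the same, so a phase-blind criterion treats them as equally good tags, whereas the sign of D separates the coupling pair from the repulsion pair.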