Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets
- PMID: 22143225
- PMCID: PMC3325408
- DOI: 10.1007/s00439-011-1118-2
Abstract
Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Mₑ) for the adjustment of multiple testing, but current methods of calculation for Mₑ are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Mₑ. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Mₑ, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds ~5 × 10⁻⁸ for current or merged commercial genotyping arrays, ~10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10⁻⁸ for the common SNPs only within genes.
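The relationship between Mₑ and a significance threshold can be illustrated with a short sketch. It assumes a Bonferroni-style correction (threshold = type 1 error rate / Mₑ); the abstract does not state which correction GEC applies, so this is an illustration of the arithmetic, not the tool's implementation. The function names are hypothetical, and the Šidák variant is included only for comparison.

```python
import math


def bonferroni_threshold(m_e, alpha=0.05):
    """Per-test p-value threshold for m_e effective independent tests.

    Assumption: Bonferroni-style correction, alpha / m_e.
    """
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return alpha / m_e


def sidak_threshold(m_e, alpha=0.05):
    """Sidak alternative: 1 - (1 - alpha)^(1 / m_e).

    Computed with log1p/expm1 to stay accurate when m_e is very large.
    """
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return -math.expm1(math.log1p(-alpha) / m_e)


def implied_effective_tests(threshold, alpha=0.05):
    """Invert the Bonferroni rule: the m_e implied by a given threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


if __name__ == "__main__":
    # Thresholds quoted in the abstract, read back as implied m_e values
    # under the Bonferroni assumption stated above.
    for p in (1e-7, 5e-8, 1e-8):
        print(f"threshold {p:.0e} -> implied m_e ~ {implied_effective_tests(p):,.0f}")
```

Under this assumption, the thresholds quoted in the abstract correspond to roughly half a million, one million, and five million effective independent tests, respectively. The Šidák form gives a marginally less stringent threshold for the same Mₑ, a difference that is negligible at this scale.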