Genome-wide association studies: quality control and population-based measures
- PMID: 19924716
- PMCID: PMC2996103
- DOI: 10.1002/gepi.20472
Abstract
Genome-wide association studies, which use hundreds of thousands of single-nucleotide polymorphism (SNP) markers, have become a standard approach for identifying disease susceptibility genes. This change in technology poses substantial computational and statistical challenges, which were addressed by the quality control, imputation, and population-based measure groups of the Genetic Analysis Workshop 16. The computational challenges pertain to efficient memory management and the computational speed of the statistical procedures, and we discuss an approach for efficient SNP storage. Accuracy and computational speed are relevant for genotype calling, and we discuss the results of a comparison of three calling algorithms. The first statistical challenge concerns statistical quality control, and we discuss two novel quality control procedures. These low-level analyses affect subsequent preparatory steps for high-level analyses, e.g., the quality of genotype imputation. After a genome-wide association study has been conducted and successfully replicated and/or validated, measures of diagnostic accuracy, including the area under the curve, are investigated. The area under the curve can be constructed from summary data in some situations. Finally, we discuss how to determine the population-attributable risk of a genetic variant that is measured only in a reference data set.
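To make the memory-management point concrete, the following is a minimal sketch of one common way to store SNP genotypes compactly: packing four genotypes into each byte at 2 bits per genotype. It is not necessarily the storage approach discussed in the workshop contributions. The genotype coding (0, 1, 2 for the minor-allele count and 3 for a missing call) and the function names `pack_genotypes` and `unpack_genotypes` are assumptions made for illustration.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes into 2 bits each, four genotypes per byte.

    Assumed coding (illustrative only): 0, 1, 2 = minor-allele count,
    3 = missing call.
    """
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2, 3):
            raise ValueError(f"genotype code {g!r} at index {i} is not in 0..3")
        # Genotype i occupies bits 2*(i % 4) .. 2*(i % 4) + 1 of byte i // 4.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 3 for i in range(n)]


if __name__ == "__main__":
    calls = [0, 1, 2, 3, 2, 1]
    blob = pack_genotypes(calls)
    print(len(calls), "genotypes stored in", len(blob), "bytes")
    print(unpack_genotypes(blob, len(calls)) == calls)
```

Under this coding, one SNP typed in 10,000 individuals needs 2,500 bytes, a quarter of what a one-byte-per-genotype array would use. The number of genotypes has to be stored separately because the last byte may be only partly filled.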
(c) 2009 Wiley-Liss, Inc.