Genet Epidemiol. 2009;33 Suppl 1(Suppl 1):S45-50.
doi: 10.1002/gepi.20472.

Genome-wide association studies: quality control and population-based measures

Andreas Ziegler. Genet Epidemiol. 2009.

Abstract

Genome-wide association studies, which use hundreds of thousands of single-nucleotide polymorphism (SNP) markers, have become a standard approach for identifying disease susceptibility genes. This change in technology poses substantial computational and statistical challenges, which were addressed in the quality control, imputation, and population-based measure groups of the Genetic Analysis Workshop 16. The computational challenges pertain to efficient memory management and the computational speed of the statistical procedures, and we discuss an approach for efficient SNP storage. Accuracy and computational speed are both relevant for genotype calling, and we discuss the results from a comparison of three calling algorithms. The first statistical challenge is related to statistical quality control, and we discuss two novel quality control procedures. These low-level analyses affect subsequent preparatory steps for high-level analyses, e.g., the quality of genotype imputation approaches. Once a genome-wide association study has been conducted and successfully replicated and/or validated, measures of diagnostic accuracy, including the area under the curve, are investigated. The area under the curve can be constructed from summary data in some situations. Finally, we discuss how the population-attributable risk of a genetic variant that is measured only in a reference data set can be determined.

Figures

Figure 1
Succession of design, experimental, and data analysis steps in a genome-wide association study and subsequent studies. Adapted from Ziegler et al. [2008].
