Genet Epidemiol. 2014 Apr;38(3):231-41.
doi: 10.1002/gepi.21789. Epub 2014 Jan 29.

Accounting for population stratification in DNA methylation studies

Richard T Barfield et al. Genet Epidemiol. 2014 Apr.

Abstract

DNA methylation is an important epigenetic mechanism that has been linked to complex diseases and is of great interest to researchers as a potential link between genome, environment, and disease. As the scale of DNA methylation association studies approaches that of genome-wide association studies, issues such as population stratification will need to be addressed. It is well-documented that failure to adjust for population stratification can lead to false positives in genetic association studies, but population stratification is often unaccounted for in DNA methylation studies. Here, we propose several approaches to correct for population stratification using principal components (PCs) from different subsets of genome-wide methylation data. We first illustrate the potential for confounding due to population stratification by demonstrating widespread associations between DNA methylation and race in 388 individuals (365 African American and 23 Caucasian). We subsequently evaluate the performance of our PC-based approaches and other methods in adjusting for confounding due to population stratification. Our simulations show that (1) all of the methods considered are effective at removing inflation due to population stratification, and (2) maximum power can be obtained with single-nucleotide polymorphism (SNP)-based PCs, followed by methylation-based PCs, which outperform both surrogate variable analysis and genomic control. Among our different approaches to computing methylation-based PCs, we find that PCs based on CpG sites chosen for their potential to proxy nearby SNPs can provide a powerful and computationally efficient approach to adjust for population stratification in DNA methylation studies when genome-wide SNP data are unavailable.

Keywords: DNA methylation; association studies; population stratification; principal components.
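The PC-based adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated Gaussian values (closer to M-values than to bounded beta values), every function name (`top_pcs`, `association_t`, `inflation_factor`) is hypothetical, and the PCs are computed from all simulated CpG sites rather than from a subset chosen to proxy nearby SNPs. The sketch tests each CpG site for association with a phenotype that has no true effect but is correlated with ancestry, then compares the genomic inflation factor with and without methylation-based PCs as covariates.

```python
import numpy as np


def top_pcs(meth, k):
    """Top-k principal component scores (samples x k) of a samples x CpGs matrix."""
    centered = meth - meth.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def association_t(meth, phenotype, covariates=None):
    """Per-CpG t statistic for the phenotype term in
    methylation ~ intercept + phenotype + covariates (ordinary least squares)."""
    n = meth.shape[0]
    cols = [np.ones(n), phenotype]
    if covariates is not None:
        cols.extend(covariates.T)  # one design column per covariate
    X = np.column_stack(cols)
    beta, _, _, _ = np.linalg.lstsq(X, meth, rcond=None)
    resid = meth - X @ beta
    dof = n - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof
    xtx_inv = np.linalg.inv(X.T @ X)
    se = np.sqrt(sigma2 * xtx_inv[1, 1])
    return beta[1] / se


def inflation_factor(t_stats):
    """Median squared statistic divided by 0.4549, the median of a
    chi-square distribution with 1 degree of freedom."""
    return np.median(t_stats ** 2) / 0.4549


# Simulated data (assumption): two ancestry groups, ancestry-dependent
# methylation shifts, and a phenotype correlated with ancestry only.
rng = np.random.default_rng(0)
n_samples, n_cpgs = 300, 2000
ancestry = rng.integers(0, 2, n_samples).astype(float)
shift = rng.normal(0.0, 0.5, n_cpgs)
meth = rng.normal(0.0, 1.0, (n_samples, n_cpgs)) + np.outer(ancestry, shift)
phenotype = ancestry + rng.normal(0.0, 1.0, n_samples)

lam_unadj = inflation_factor(association_t(meth, phenotype))
lam_adj = inflation_factor(association_t(meth, phenotype, top_pcs(meth, 2)))
print(f"lambda unadjusted: {lam_unadj:.2f}, PC-adjusted: {lam_adj:.2f}")
```

In this simulation the unadjusted inflation factor should sit well above 1, while the PC-adjusted value should be close to 1, because the leading PCs capture the ancestry signal that confounds the phenotype. Using two PCs is an arbitrary choice for the sketch; the number of PCs in a real analysis would need to be justified from the data.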

Figures

Figure 1. Principal components by self-reported race
(A) 1st and 2nd PCs from PCGWAS. (B) 2nd and 3rd PCs from PC0bp. (C) 4th and 6th PCs from PC50bp. Red points = African American individuals; blue points = Caucasian individuals.

Figure 2. Replication of aging results
Replication of the top eight CpG sites associated with aging [Teschendorff, et al. 2010], using 16 different approaches to adjust for population stratification.

Figure 3. Replication of smoking results
Replication of the top CpG site associated with smoking in [Breitling, et al. 2012; Breitling, et al. 2011; Shenker, et al. 2013; Sun, et al. 2013; Wan, et al. 2012], using 16 different approaches to adjust for population stratification. The dotted line indicates the p-value from a previous replication based on 239 African Americans from our sample [Sun, et al. 2013].
