BMC Genomics. 2014 Jan 22;15(1):51.
doi: 10.1186/1471-2164-15-51.

Reducing the risk of false discovery enabling identification of biologically significant genome-wide methylation status using the HumanMethylation450 array

Haroon Naeem et al. BMC Genomics.

Abstract

Background: The Illumina HumanMethylation450 BeadChip (HM450K) measures the DNA methylation of 485,512 CpGs in the human genome. The technology relies on hybridization of genomic fragments to probes on the chip. However, certain genomic factors, such as single nucleotide polymorphisms (SNPs), small insertions and deletions (INDELs), repetitive DNA, and regions with reduced genomic complexity, may compromise the array's ability to measure methylation. Currently, there is no clear method or pipeline for determining which probes on the HM450K bead array should be retained for subsequent analysis in light of these issues.

Results: We comprehensively assessed the effects of SNPs, INDELs, repeats and bisulfite-induced reduced genomic complexity by comparing HM450K bead array results with whole genome bisulfite sequencing (WGBS). We determined which CpG probes provided accurate signals and which provided noisy ones. From this, we derived a set of high-quality probes that provide unadulterated measurements of DNA methylation.

Conclusions: Our method significantly reduces the risk of false discoveries when using the HM450K bead array, while maximising the power of the array to detect methylation status genome-wide. Additionally, we demonstrate the utility of our method through extraction of biologically relevant epigenetic changes in prostate cancer.

Figures

Figure 1
This figure summarises the comparison between WGBS and HM450K beta-values for the H1-hESC cell line. Panels (a) and (b) show contour plots of the correlation of DNA methylation between WGBS and the HM450K bead array for Infinium I and Infinium II probes, respectively. The contours (different colour intensities) capture the density of beta-values. Most points lie close to (0, 0) and (1, 1), resulting in high correlation between the platforms. Panels (c) and (d) show boxplots of the absolute beta differences between WGBS and the HM450K bead array, plotted for different potential filtering categories for Infinium I and II probes, respectively. The box extends from the first to the third quartile and the whiskers extend to 1.5 times the interquartile range. Points outside this range are considered outliers. Blue boxplots mark filtering categories whose distribution differed statistically significantly (P < 0.001) from that of the high-quality probes (golden boxplot); all other categories are plotted in light green. The red dotted line depicts the median of the high-quality probe set. Category definitions: High-quality: probes that are not affected by any genomic factor; Repeats: probes that hybridize to repetitive regions; Bis-okay: probes that hybridize to regions containing a C>T or T>C SNP and are 'okay' in bisulfite space; SNP-at-CpG-C and SNP-at-CpG-G: probes that have a SNP at the interrogated C or at its neighbouring G position, respectively; Indels: probes that hybridize to regions containing INDELs; Multimap: probes that hybridize to multiple genomic loci; SNP-1: probes that contain exactly one SNP anywhere in the probe body; and SNP ≥ 2: probes that contain at least 2 SNPs anywhere in the probe body.
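The quantities plotted in panels (c) and (d) can be illustrated with a minimal sketch, not the authors' code. It computes absolute beta differences between the two platforms and the 1.5 × IQR limits beyond which points count as outliers. The function names, the example beta-values, and the choice of the "inclusive" quartile method are assumptions made for illustration only.

```python
from statistics import quantiles


def abs_beta_diff(array_beta, wgbs_beta):
    """Absolute per-probe difference between array and WGBS beta-values."""
    return [abs(a - w) for a, w in zip(array_beta, wgbs_beta)]


def box_stats(values):
    """Quartiles, 1.5 x IQR limits, and the points that fall outside them."""
    # The quartile method is an assumption; plotting tools differ slightly here.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "median": med, "q3": q3,
            "lower": lower, "upper": upper, "outliers": outliers}


# Hypothetical beta-values for five probes in one filtering category.
hm450k = [0.10, 0.90, 0.50, 0.80, 0.95]
wgbs = [0.12, 0.88, 0.45, 0.20, 0.97]

diffs = abs_beta_diff(hm450k, wgbs)
stats = box_stats(diffs)
```

In this made-up example, the fourth probe disagrees strongly between platforms and is the only point outside the 1.5 × IQR limits.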
Figure 2
This figure shows histograms of the number of probes with SNPs at the interrogated CpG (y-axis) against their beta-value (x-axis). In each case, the probes have a SNP that causes a mismatch in the probe sequence, either at the C of the interrogated CpG (a) or at the G of the interrogated CpG (b). These plots were generated using HM450K data from the H1-hESC cell line.
Figure 3
This figure shows gene interaction networks from the STRING protein-protein interaction database. The networks were derived from 3 sets of differentially methylated gene promoters: the genes uniquely identified (a) with no probe filtering, (b) with conservative probe filtering, and (c) with our recommended probe filtering procedure. Dark blue represents unmethylated gene promoters in prostate cancer and red represents methylated gene promoters in prostate cancer.
Figure 4
This figure shows the distribution of the standard deviation in beta-values for probes on the HM450K bead array, using (a) 4 blood samples and (b) 261 blood samples. In each case, the distribution of standard deviations (≥ 0.10) was plotted for all probes, for probes kept for subsequent analysis (Keep probes), and for probes recommended for removal (Discard probes).
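As a rough illustration of the quantity behind this figure, the sketch below computes the per-probe standard deviation of beta-values across samples and keeps the probes at or above the 0.10 cut-off quoted in the caption. It is not the authors' pipeline: the probe identifiers and beta-values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import stdev

SD_THRESHOLD = 0.10  # cut-off quoted in the Figure 4 caption


def variable_probes(betas_by_probe, threshold=SD_THRESHOLD):
    """Return {probe_id: sd} for probes whose SD across samples is >= threshold."""
    selected = {}
    for probe_id, betas in betas_by_probe.items():
        sd = stdev(betas)  # sample SD (n - 1); an assumption, see lead-in
        if sd >= threshold:
            selected[probe_id] = sd
    return selected


# Hypothetical beta-values for two probes measured in four samples.
example = {
    "probe_a": [0.10, 0.12, 0.11, 0.09],  # stable across samples
    "probe_b": [0.20, 0.80, 0.50, 0.45],  # highly variable across samples
}

high_sd = variable_probes(example)
```

Here only `probe_b` passes the cut-off; comparing how many such probes fall in the kept and discarded sets is the kind of summary the figure presents.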

