Epigenetics Chromatin. 2013 Mar 3;6(1):4.
doi: 10.1186/1756-8935-6-4.

Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array

Magda E Price et al. Epigenetics Chromatin.

Abstract

Background: Measurement of genome-wide DNA methylation (DNAm) has become an important avenue for investigating potential physiologically-relevant epigenetic changes. Illumina Infinium (Illumina, San Diego, CA, USA) is a commercially available microarray suite used to measure DNAm at many sites throughout the genome. However, it has been suggested that a subset of array probes may give misleading results due to issues related to probe design. To facilitate biologically significant data interpretation, we set out to enhance probe annotation of the newest Infinium array, the HumanMethylation450 BeadChip (450 k), with >485,000 probes covering 99% of Reference Sequence (RefSeq) genes (National Center for Biotechnology Information (NCBI), Bethesda, MD, USA). Annotation that was added or expanded on includes: 1) documented SNPs in the probe target, 2) probe binding specificity, 3) CpG classification of target sites and 4) gene feature classification of target sites.

Results: Probes with documented SNPs at the target CpG (4.3% of probes) were associated with increased within-tissue variation in DNAm. An example of a probe with a SNP at the target CpG demonstrated how sample genotype can confound the measurement of DNAm. Additionally, 8.6% of probes mapped to multiple locations in silico. Measurements from these non-specific probes likely represent a combination of DNAm from multiple genomic sites. The expanded biological annotation demonstrated that grouping probes by an alternative high-density and intermediate-density CpG island classification provided a distinctive pattern of DNAm. Finally, variable enrichment for differentially methylated probes was noted across CpG classes and gene feature groups, dependent on the tissues that were compared.

Conclusion: DNAm arrays offer a high-throughput approach, but probe content should be considered carefully to better understand the biological processes affected. Probes containing SNPs and non-specific probes may affect the assessment of DNAm using the 450 k array. Additionally, probe classification by CpG enrichment classes and, to a lesser extent, gene feature groups resulted in distinct patterns of DNAm. Thus, we recommend that compromised probes be removed from analyses and that the genomic context of DNAm be considered in studies deciphering the biological meaning of Illumina 450 k array data.
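
The recommendation to remove compromised probes can be illustrated with a minimal Python sketch. It assumes the probes flagged for a target CpG SNP and for non-specific binding are already available as lists of probe IDs; the function name, the data layout and the probe IDs are hypothetical and are not taken from the paper.

```python
def remove_compromised_probes(beta_values, snp_probes, nonspecific_probes):
    """Return a copy of beta_values without probes flagged as compromised.

    beta_values: dict mapping probe ID -> list of beta values (one per sample).
    snp_probes, nonspecific_probes: iterables of flagged probe IDs
    (assumed to come from an annotation file; hypothetical layout).
    """
    flagged = set(snp_probes) | set(nonspecific_probes)
    return {
        probe: betas
        for probe, betas in beta_values.items()
        if probe not in flagged
    }


# Toy data with made-up probe IDs, for illustration only.
beta_values = {
    "cg_example_1": [0.10, 0.12, 0.09],
    "cg_example_2": [0.02, 0.51, 0.97],  # imagine a SNP at the target CpG
    "cg_example_3": [0.80, 0.83, 0.79],  # imagine a probe mapping to several sites
}
kept = remove_compromised_probes(
    beta_values,
    snp_probes=["cg_example_2"],
    nonspecific_probes=["cg_example_3"],
)
print(sorted(kept))  # prints ['cg_example_1']
```

Filtering on the union of both flag lists keeps the step independent of how many annotation categories are treated as compromised.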

Figures

Figure 1
Probes targeting polymorphic CpGs may affect the assessment of DNAm. (A) A documented SNP was identified at the target C or G position of 4.3% of 450 k probes (target CpG SNP). Of these SNPs, 43.2% had a heterozygosity of >0.1 and, due to their frequency in the population, are more likely to affect measurement of DNAm. (B) Using blood samples (n = 4) as an example, the SD in β value between individuals was calculated for all probes. Probes with small SD in β (<0.10) were removed from the analysis. The distribution of SD in β value was plotted for all probes, and for the subsets of probes annotated with a target CpG SNP, a SNP within 10 bp of the target but without a target CpG SNP (SNP <10 bp) and a SNP within the remainder of the probe (SNP >10 bp). Numbers in brackets indicate Kolmogorov-Smirnov (KS) statistics in comparison to the distribution of all probes. (C) Using a selection of 261 adult blood samples extracted from the aging dataset [GSE:40279], the distribution of SD in β value was plotted for the subsets of probes as described in (B). Numbers in brackets indicate KS statistics in comparison to the distribution of all probes. (D) DNAm at probe cg06961873 across 12 individuals exemplified the trichotomous pattern of DNAm hypothesized at a target CpG SNP. The three distinct levels of DNAm corresponded to sample genotype at SNP rs61775206, located at the target CpG: TT genotypes were assessed as hypomethylated, TC genotypes as approximately 50% methylated and CC genotypes as close to fully methylated. 450 k, Infinium HumanMethylation450 BeadChip; DNAm, DNA methylation.
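
The analysis described in panels (B) and (C) can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: it assumes the sample standard deviation (n − 1 denominator) and a two-sample KS statistic computed as the largest gap between empirical distribution functions; the function names and toy values are hypothetical.

```python
from bisect import bisect_right
from statistics import stdev


def probe_sd(beta_values, min_sd=0.10):
    """Per-probe SD of beta across individuals, dropping probes with SD < min_sd.

    beta_values: dict mapping probe ID -> list of beta values (one per individual).
    Uses the sample SD (n - 1), which is an assumption.
    """
    sds = {probe: stdev(betas) for probe, betas in beta_values.items()}
    return {probe: sd for probe, sd in sds.items() if sd >= min_sd}


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the maximum absolute difference between
    the empirical cumulative distribution functions of the two samples."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Toy data with made-up probe IDs: compare the SD distribution of a
# flagged subset with the SD distribution of all retained probes.
beta_values = {
    "cg_example_1": [0.50, 0.50, 0.50, 0.50],  # no variation, filtered out
    "cg_example_2": [0.00, 0.50, 1.00, 0.50],
    "cg_example_3": [0.20, 0.45, 0.30, 0.60],
}
sds = probe_sd(beta_values)
subset = [sds["cg_example_2"]]  # e.g. probes with a target CpG SNP
print(ks_statistic(subset, list(sds.values())))
```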
Figure 2
Comparison of the genomic distribution of Illumina-annotated CpG probe classes within each HIL-annotated CpG probe class. Within HCs, ICshores and LCs, the majority of probes were categorized into the respective Illumina-annotated CpG class. However, even though ICs and ICshores have the same CpG density, the distribution of probes based on Illumina CpG classes was different between these two HIL classes, suggesting a functional difference between ICs that border HCs and isolated ICs. HC, high-density CpG island; HIL, high-density CpG island (HC), intermediate-density CpG island (IC) and non-island (LC); ICshore, intermediate-density CpG island shore.
Figure 3
Distinct patterns of DNAm within CpG classification systems. Probes were grouped into three levels of DNAm based on average β values within a tissue: hypomethylated (β values of 0 to ≤0.2, yellow), heterogeneously methylated (β values of >0.2 to <0.8, light blue) and hypermethylated (β values of ≥0.8 to 1, dark blue). The percentage of probes in Illumina and HIL-annotated CpG classes was plotted for the three levels of DNAm in blood (n = 4). HIL CpG classes were more characteristic in their DNAm profiles than Illumina-annotated CpG classes. Numbers on top of bars indicate number of probes/class. DNAm, DNA methylation; HIL, high-density CpG island (HC), intermediate-density CpG island (IC) and non-island (LC); ICshore, intermediate-density CpG island shore.
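
The three DNAm levels defined in this caption translate directly into thresholds on the average β value. A minimal Python sketch, with a hypothetical function name and made-up example values:

```python
def methylation_level(mean_beta):
    """Assign one of the three DNAm levels from the caption to an average beta value."""
    if not 0.0 <= mean_beta <= 1.0:
        raise ValueError("beta values must lie between 0 and 1")
    if mean_beta <= 0.2:
        return "hypomethylated"
    if mean_beta < 0.8:
        return "heterogeneously methylated"
    return "hypermethylated"


# Boundary values follow the caption: 0.2 is hypomethylated, 0.8 is hypermethylated.
for beta in (0.05, 0.2, 0.5, 0.8, 0.95):
    print(beta, methylation_level(beta))
```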
Figure 4
Enrichment of differentially methylated probes in many CpG classes. Probes that were differentially methylated between blood and buccal samples (n = 69,174), or between blood and chorionic villus samples (n = 91,255), were assessed for enrichment in (A) Illumina and (B) HIL-annotated CpG classes. Enrichment was plotted as ‘percentage relative enrichment’, representing the enrichment of tDM probes relative to the total percentage of probes within each CpG class. Negative percentage relative enrichment indicates that tDM probes were depleted in the given probe-type category whereas positive percentage relative enrichment indicates that tDM probes were enriched in the given probe-type category. All enrichment analyses were significant with the exception of ICshore probes in the comparison of blood versus chorionic villi. HIL, high-density CpG island (HC), intermediate-density CpG island (IC) and non-island (LC); ICshore, intermediate-density CpG island shore; tDM, differentially methylated between tissues.
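
The caption does not give an explicit formula for 'percentage relative enrichment'. One plausible reading, shown here purely as an assumption, is the difference between the percentage of tDM probes in a class and the percentage of all probes in that class, divided by the latter. The function name and the counts below are hypothetical.

```python
def percentage_relative_enrichment(tdm_in_class, tdm_total, probes_in_class, probes_total):
    """Relative enrichment (%) of tDM probes in a class versus all probes.

    ASSUMED formula (not stated in the caption):
        (observed fraction - expected fraction) / expected fraction * 100
    where observed = tDM probes in class / all tDM probes
    and   expected = probes in class / all probes.
    """
    if tdm_total <= 0 or probes_total <= 0 or probes_in_class <= 0:
        raise ValueError("totals and class size must be positive")
    observed = tdm_in_class / tdm_total
    expected = probes_in_class / probes_total
    return (observed - expected) / expected * 100.0


# Made-up counts: a class holding 20% of all probes but 30% of tDM probes.
enriched = percentage_relative_enrichment(30, 100, 200, 1000)
# The same class holding only 10% of tDM probes.
depleted = percentage_relative_enrichment(10, 100, 200, 1000)
print(round(enriched, 1), round(depleted, 1))
```

Under this reading a positive value means the class holds more tDM probes than its share of the array would predict and a negative value means fewer, which matches the sign convention described in the caption.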
Figure 5
Illustration of gene feature annotation. Based on the overlap of three gene components (first exon vs exon vs intron) with three gene regions (5’UTR vs body vs 3’UTR) probes were annotated into the following nine gene feature groups: 1) 5’UTR first exons, 2) 5’UTR exons, 3) 5’UTR introns, 4) body first exons, 5) body exons, 6) body introns, 7) 3’UTR first exons, 8) 3’UTR exons and 9) 3’UTR introns (corresponding to numbers below transcripts). A given probe could be annotated with more than one gene feature, as illustrated by the multiple transcripts (A to E) of a fictional gene. Probe i would be annotated as 5’UTR exon, 5’UTR first exon and 5’UTR intron; probe ii would be annotated as body exon, 5’UTR intron, body intron, body first exon; and probe iii would be annotated as 3’UTR exon, 3’UTR intron, 3’UTR exon, 3’UTR exon and 3’UTR first exon. White boxes represent untranslated exons, grey boxes represent translated exons. 5’UTR, 5’ untranslated region; 3’UTR, 3’ untranslated region.
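
The annotation scheme in this caption can be sketched as combining one gene region with one gene component per overlapping transcript and collecting the distinct results. The Python below is an illustration under that assumption; the function name and input layout are hypothetical, and plain apostrophes are used in the labels.

```python
REGIONS = ("5'UTR", "body", "3'UTR")
COMPONENTS = ("first exon", "exon", "intron")


def gene_features(transcript_hits):
    """Collect the distinct gene feature groups for one probe.

    transcript_hits: iterable of (region, component) pairs, one pair per
    transcript the probe overlaps (hypothetical input layout).
    """
    features = set()
    for region, component in transcript_hits:
        if region not in REGIONS or component not in COMPONENTS:
            raise ValueError(f"unknown region/component: {region!r}, {component!r}")
        features.add(f"{region} {component}")
    return features


# Probe iii from the caption overlaps five transcripts but falls into
# only three distinct gene feature groups.
probe_iii = [
    ("3'UTR", "exon"),
    ("3'UTR", "intron"),
    ("3'UTR", "exon"),
    ("3'UTR", "exon"),
    ("3'UTR", "first exon"),
]
print(sorted(gene_features(probe_iii)))
```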
Figure 6
Contribution of HIL CpG classes to probes in nine gene feature groups. The percentage of probes within each HIL CpG class was different for each gene feature group. Numbers on top of bars indicate the number of probes/gene feature group; a total of 213,315 probes were located within these nine gene feature groups. HIL, high-density CpG island (HC), intermediate-density CpG island (IC) and non-island (LC); ICshore, intermediate-density CpG island shore.
Figure 7
Variation of gene feature DNAm within a CpG class. The level of DNAm was plotted as an average β value for each gene feature in blood. Analyses were conducted within each HIL CpG class due to the large differences in DNAm that were observed between classes. Average β values varied across probes by (A) gene location, as exemplified by intronic probes and (B) gene components, as exemplified by 5’UTR probes. 5’UTR, 5’ untranslated region; DNAm, DNA methylation; HIL, high-density CpG island (HC), intermediate-density CpG island (IC) and non-island (LC); ICshore, intermediate-density CpG island shore.
