Genome Res. 2010 Dec;20(12):1719-29.
doi: 10.1101/gr.110601.110. Epub 2010 Nov 2.

Evaluation of affinity-based genome-wide DNA methylation data: effects of CpG density, amplification bias, and copy number variation

Mark D Robinson et al. Genome Res. 2010 Dec.

Erratum in

  • Genome Res. 2011 Jan;21(1):146

Abstract

DNA methylation is an essential epigenetic modification that plays a key role in the regulation of gene expression during differentiation, but in disease states such as cancer, the DNA methylation landscape is often deregulated. There are now numerous technologies available to interrogate the DNA methylation status of CpG sites in a targeted or genome-wide fashion, but each method, due to intrinsic biases, potentially interrogates different fractions of the genome. In this study, we compare the affinity-purification of methylated DNA between two popular genome-wide techniques, methylated DNA immunoprecipitation (MeDIP) and methyl-CpG binding domain-based capture (MBDCap), and show that each technique operates in a different domain of the CpG density landscape. We explored the effect of whole-genome amplification and illustrate that it can reduce sensitivity for detecting DNA methylation in GC-rich regions of the genome. By using MBDCap, we compare and contrast microarray- and sequencing-based readouts and highlight the impact that copy number variation (CNV) can have on differential comparisons of methylomes. These studies reveal that the analysis of DNA methylation data and genome coverage is highly dependent on the method employed, and consideration must be made in light of the GC content, the extent of DNA amplification, and the copy number.

Figures

Figure 1.
(A) Schematic showing the capture of methylated DNA into populations of single-stranded (MeDIP) or double-stranded (MBDCap) fragments. (B) Summarized probe intensities for enrichment of fully methylated DNA with MeDIP and two variations of MethylMiner-based enrichment. X-axis shows the local CpG density group (1–50). Y-axis shows the log2-scale input-normalized intensity. Each line shows the median of the input-normalized intensities for the probes in each bin (here, for probes with GC content of 11 only). The intensities are further normalized such that the median in the lowest bin is 0. The location of probes within CpG islands is shown by the gray-shaded region, corresponding to a local CpG density score between 12 and 40. (C) Summarized read counts in bins of 1000 bases over the same genomic regions interrogated by the Affymetrix Promoter 1.0R array. Each line represents the median log2 read count (RCpM indicates read counts per million mapped); the summaries are normalized such that the median in the lowest bin is 0.
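The summaries in panels B and C can be illustrated with a minimal sketch. It assumes the values are already input-normalized log2 intensities (or log2 read counts) and that the bins are formed by ranking on local CpG density. The function names, the pseudocount, and the tie handling are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
from statistics import median


def bin_medians(density, values, n_bins=50):
    """Median of `values` in equally sized bins ordered by `density`.

    The medians are shifted so that the lowest-density bin has median 0,
    mirroring the normalization described in the caption.
    """
    if len(density) != len(values):
        raise ValueError("density and values must have the same length")
    if len(values) < n_bins:
        raise ValueError("need at least one value per bin")

    # Rank the probes (or windows) by local CpG density.
    order = sorted(range(len(density)), key=lambda i: density[i])
    n = len(order)

    medians = []
    for b in range(n_bins):
        # Integer arithmetic gives bins whose sizes differ by at most one.
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        medians.append(median(values[i] for i in order[start:stop]))

    baseline = medians[0]
    return [m - baseline for m in medians]


def log2_rcpm(count, total_mapped, pseudocount=1.0):
    """log2 read counts per million mapped reads.

    The pseudocount (an assumption, not from the paper) avoids log2(0).
    """
    return math.log2((count + pseudocount) * 1e6 / total_mapped)
```

For example, `bin_medians(cpg_density, log2_ratio)` would return 50 values, one per CpG density group, with the first equal to 0.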
Figure 2.
Box-and-whisker plots of unnormalized log2-scale microarray intensities for unamplified and WGA-amplified genomic DNA. To control for the association between probe GC content and intensity, probes with GC content of 8, 11, and 14 (out of 25) are shown in A–C, respectively. Plots for the remaining probe GC contents (and further experimental samples) are shown in Supplemental Figure 5. Probes are grouped into 50 equally sized bins genome-wide based on their local CpG density, as shown in Figure 1, B and C. Box-and-whisker plots show the 25th and 75th percentile as the bottom and top of the box, and the band represents the median; the whiskers show the lowest data point within 1.5 interquartile range (IQR) of the 25th percentile and the highest data point within 1.5 IQR of the 75th percentile.
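The box-and-whisker convention described in this caption can be sketched as follows. The quartile interpolation rule is an assumption (Python's "inclusive" method); the paper does not state which rule was used, and other rules give slightly different box edges. The function name is hypothetical.

```python
from statistics import quantiles


def box_whisker_stats(values):
    """Box edges, median, and whisker ends for one bin of intensities."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")

    # 25th, 50th, and 75th percentiles (interpolation rule is an assumption).
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Whiskers end at the most extreme data points that still lie
    # within 1.5 IQR of the box edges.
    lower_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)

    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }
```

Points beyond the whisker ends would be drawn individually as outliers in most plotting libraries.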
Figure 3.
Observed cumulative bias of various amplification methods. X-axis denotes the probe GC content. Y-axis denotes the cumulative bias score, which captures the cumulative signal attenuation over the 50 bins of local CpG density (for definition, see Methods). Each line represents a different amplification strategy.
Figure 4.
Box-and-whisker plots of CpG density for putative DMRs (at estimated false discovery rate of 5%) between LNCaP and PrEC cells. Shown are hypermethylated (A) and hypomethylated (B) regions.
Figure 5.
Comparison of MBD-SF tiling array and sequencing data. (A) Differential methylation Z-scores between LNCaP and PrEC cells using MBD-SF-seq (y-axis) and MBD-SF-chip (x-axis). The six validated genes that are shown in Supplemental Figures 3A and 4 are indicated with black dots. The remaining dot colors are chosen according to the differential methylation concordance between MBD-SF-seq and MBD-SF-chip Z-score as depicted in B. Note that some truly differentially methylated promoters, such as WNT2, are deemed “Indeterminate” by this concordance classification. (B) Box-and-whisker plots of CpG density for concordant and discordant differentially methylated promoters, with colors corresponding to the cutoffs shown in A. (C) Box-and-whisker plots of sequencing mapability of the concordant and discordant differentially methylated promoters, using the colors from A.
Figure 6.
Using promoter tiling arrays to estimate changes in copy number. (A) Y-axis is the difference in copy number between the prostate cancer and normal epithelial cell line using the Affymetrix Promoter 1.0R array along human chromosome 5. The gray line represents kernel-smoothed differences over 200 kb. (B) Y-axis shows the difference in copy number using the Affymetrix SNP 6.0 array along the same region of chromosome 5. The gray line represents kernel-smoothed differences over 50 kb. (C) X-axis and y-axis represent the smoothed copy number changes between the prostate cancer and epithelial cell lines for the Promoter 1.0R and SNP 6.0 arrays, respectively, genome-wide over a common set of loci.
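As a rough illustration of the kernel smoothing mentioned in this caption, the sketch below applies a Nadaraya-Watson smoother along one chromosome. The caption does not specify the kernel or how the 200 kb and 50 kb spans map to a bandwidth, so the Gaussian kernel, the `bandwidth_bp` parameter, and the function name are all assumptions.

```python
import math


def kernel_smooth(positions, values, bandwidth_bp):
    """Gaussian-kernel weighted average of `values` at each position.

    `positions` are genomic coordinates in base pairs on one chromosome;
    `bandwidth_bp` is the kernel standard deviation (an assumed
    interpretation of the smoothing span).
    """
    if len(positions) != len(values):
        raise ValueError("positions and values must have the same length")
    if bandwidth_bp <= 0:
        raise ValueError("bandwidth_bp must be positive")

    smoothed = []
    for x in positions:
        # Weight every locus by its distance from the current position.
        weights = [
            math.exp(-0.5 * ((x - p) / bandwidth_bp) ** 2) for p in positions
        ]
        total = sum(weights)  # at least 1, since the locus weights itself
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed
```

This direct form is quadratic in the number of loci; for array-scale data a windowed or binned implementation would be preferable.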
Figure 7.
Effects of copy number changes on differential methylation detection. (A) Differential methylation Z-scores between LNCaP and PrEC cells, using MBD-SF-seq, for human chromosome 13. (B) Smoothed Affymetrix SNP 6.0 array data showing corresponding changes in copy number. (C) Genome-wide distributions of Z-scores, stratified by the change-in-copy-number status of the corresponding regions.
