BMC Genomics. 2012 Jan 25:13:42.

doi: 10.1186/1471-2164-13-42.

An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

Michiel E Adriaens¹, Magali Jaillard, Lars M T Eijssen, Claus-Dieter Mayer, Chris T A Evelo

PMID: 22276688
PMCID: PMC3293711
DOI: 10.1186/1471-2164-13-42

Michiel E Adriaens et al. BMC Genomics. 2012.

Affiliation

¹ Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, The Netherlands. michiel.adriaens@maastrichtuniversity.nl

Abstract

Background: The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG dinucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and the study design all differ from those of transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. In particular, the main challenge in normalizing "regulation microarrays" is twofold: (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate.

Results: We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures.
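The separation between the control probe and gene promoter probe distributions can be summarized by a ROC curve and its AUC, which the figure legends of this article report. The following is a minimal sketch, assuming the AUC is computed from log-ratios alone through the rank-based (Mann-Whitney) formulation; the function name `auc` and the toy values are hypothetical and are not taken from the paper.

```python
def auc(negatives, positives):
    """Area under the ROC curve for two samples of scores.

    Equals the probability that a randomly chosen positive score
    (here: a promoter probe log-ratio) exceeds a randomly chosen
    negative score (here: a control probe log-ratio). Ties count 0.5.
    """
    if not negatives or not positives:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical log-ratios, for illustration only.
control_log_ratios = [-0.3, -0.1, 0.0, 0.1, 0.2]
promoter_log_ratios = [-0.2, 0.1, 0.4, 0.9, 1.5]

print(auc(control_log_ratios, promoter_log_ratios))
```

An AUC near 0.5 would indicate that the two distributions are hard to tell apart, while values closer to 1 indicate better separation of enriched from un-enriched signal. The pairwise loop is quadratic in the number of probes, so a rank-based implementation would be preferable for full-size arrays.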

Conclusion: T-quantile normalization applied separately to the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches such as quantile, LOWESS, Peng's method and VSN alter the data distributions of regulation microarrays to such an extent that their use substantially reduces the reliability of the downstream analysis.
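To make the idea of normalizing the channels separately concrete, here is a minimal sketch. It assumes that "applied separately on the channels" means quantile-normalizing the immunoprecipitated (IP) channel across arrays and the input channel across arrays independently, and only then forming log-ratios. The exact T-quantile definition is given in the Methods of the full paper and may differ, so the function names and toy data below are hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize equally long columns (one column per microarray).

    Each value is replaced by the mean, across columns, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = [sum(col[k] for col in sorted_cols) / len(columns)
                 for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def per_channel_log_ratios(ip_arrays, input_arrays):
    """Normalize each channel on its own, then return per-array log-ratios.

    Both arguments are lists of arrays holding log2 intensities.
    """
    ip_norm = quantile_normalize(ip_arrays)
    input_norm = quantile_normalize(input_arrays)
    return [[a - b for a, b in zip(ip, inp)]
            for ip, inp in zip(ip_norm, input_norm)]


# Hypothetical log2 intensities: two arrays, three probes each.
ip = [[10.0, 8.0, 12.0], [11.0, 9.0, 15.0]]
inp = [[8.0, 8.5, 9.0], [9.0, 9.5, 10.0]]
log_ratios = per_channel_log_ratios(ip, inp)
print(log_ratios)
```

Because each channel receives its own reference distribution, the IP channel is not forced to share a distribution with the input channel, which is one plausible reason such a scheme could preserve the separation between enriched and un-enriched probes.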

Figures

**Figure 1**
**The birth of an enrichment signal around a binding site (ChIP-on-chip)**. Since DNA fragmentation through sonication can be modeled as a Poisson process [1], the DNA fragment length distribution follows a Poisson distribution and adjacent probes on the genome have a correlated log-ratio, resulting in the hybridization pattern shown here. Each blue column represents a probe hybridization site. Black-outlined bars represent their log-ratio. Green lines are sonicated immuno-precipitated DNA fragments corresponding to the binding site.

**Figure 2**
**An example of a two-component distribution fitted on ChIP-on-chip data of dataset #1 (see Methods section for dataset description and numbering)**.

**Figure 3**
Density distributions of the control probes and gene promoter probes of the raw log-ratio data of all individual microarrays and corresponding ROC curves for dataset #1 (a), dataset #2 (b), dataset #3 (c), dataset #4 (d) and dataset #5 (e). AUC values of each ROC curve are reported in the legend.

**Figure 4**
ROC curves of the control probe and gene promoter distributions of the combined log-ratio data, for each normalization approach of dataset #1 (a), dataset #2 (b), dataset #3 (c), dataset #4 (d) and dataset #5 (e). AUC values are reported in the legend. TBW = Tukey's biweight scaling, Q = quantile normalization, TQ = T-quantile normalization.

**Figure 5**
**Density distributions of the control probes and gene promoter probes of the normalized combined log-ratio data of dataset #2 (ChIP-on-chip)**. Results are shown for (from left to right, top to bottom) VSN, LOWESS, quantile (Q), T-quantile (TQ), Tukey's biweight scaling (TBW), Peng's method.

**Figure 6**
**Density distributions of the control probes and gene promoter probes of the normalized log-ratio data of each individual microarray and corresponding ROC curves of dataset #2 (ChIP-on-chip)**. Top: Results for T-quantile (TQ) normalized data. Bottom: Results for Tukey's biweight scaling (TBW) normalized data. AUC values of each ROC curve are reported in the legend.

**Figure 7**
**Density distributions for the control probes and gene promoter probes of the normalized log-ratio data of each individual microarray and corresponding ROC curves of dataset #4 and #5**. Top left: Results for T-quantile (TQ) normalized data of dataset #4. Top right: Results for Tukey's biweight scaling (TBW) normalized data of dataset #4. Bottom left: Results for T-quantile (TQ) normalized data of dataset #5. Bottom right: Results for Tukey's biweight scaling (TBW) normalized data of dataset #5. AUC values of each ROC curve are reported in the legend.

**Figure 8**
**Genome plots of negative ¹⁰log-transformed enrichment p-values, for the HOXA cluster on human chromosome 7 (top) and the Dlk1-Gtl2 cluster on mouse chromosome 12 (bottom)**. Red vertical lines are given at values corresponding to p-values of 0.05 (top line) and 0.20 (bottom line). Regions with values above the top line are highly enriched, while values between the lines are a sign of moderate enrichment. The total number of identified enriched regions is reported in the legend. TBW = Tukey's biweight scaling, Q = quantile normalization, TQ = T-quantile normalization.
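The two reference lines in Figure 8 correspond to fixed values on the negative log10 scale. A minimal sketch of that transformation and of the resulting three-way classification follows; the function names and the labels "high", "moderate" and "none" are hypothetical and chosen only for illustration.

```python
import math


def neg_log10(p_value):
    """Negative base-10 logarithm of a p-value in the interval (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p_value)


# Positions of the two reference lines on the transformed scale.
HIGH_LINE = neg_log10(0.05)      # roughly 1.30
MODERATE_LINE = neg_log10(0.20)  # roughly 0.70


def enrichment_level(p_value):
    """Classify a region by where its transformed p-value falls."""
    score = neg_log10(p_value)
    if score > HIGH_LINE:
        return "high"
    if score > MODERATE_LINE:
        return "moderate"
    return "none"


for p in (0.01, 0.1, 0.5):
    print(p, round(neg_log10(p), 3), enrichment_level(p))
```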

Cited by

Simultaneous Improvement in the Precision, Accuracy, and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains.
Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L, Xue W, Zhu F. Mol Cell Proteomics. 2019 Aug;18(8):1683-1699. doi: 10.1074/mcp.RA118.001169. Epub 2019 May 16. PMID: 31097671. Free PMC article.
Novel technologies and emerging biomarkers for personalized cancer immunotherapy.
Yuan J, Hegde PS, Clynes R, Foukas PG, Harari A, Kleen TO, Kvistborg P, Maccalli C, Maecker HT, Page DB, Robins H, Song W, Stack EC, Wang E, Whiteside TL, Zhao Y, Zwierzina H, Butterfield LH, Fox BA. J Immunother Cancer. 2016 Jan 19;4:3. doi: 10.1186/s40425-016-0107-3. eCollection 2016. PMID: 26788324. Free PMC article. Review.
Gene promoter DNA methylation patterns have a limited role in orchestrating transcriptional changes in the fetal liver in response to maternal folate depletion during pregnancy.
McKay JA, Adriaens M, Evelo CT, Ford D, Mathers JC. Mol Nutr Food Res. 2016 Sep;60(9):2031-42. doi: 10.1002/mnfr.201600079. Epub 2016 Jun 6. PMID: 27133805. Free PMC article.
Methylation Landscape of Human Breast Cancer Cells in Response to Dietary Compound Resveratrol.
Medina-Aguilar R, Pérez-Plasencia C, Marchat LA, Gariglio P, García Mena J, Rodríguez Cuevas S, Ruíz-García E, Astudillo-de la Vega H, Hernández Juárez J, Flores-Pérez A, López-Camarillo C. PLoS One. 2016 Jun 29;11(6):e0157866. doi: 10.1371/journal.pone.0157866. eCollection 2016. PMID: 27355345. Free PMC article.
Epigenetics and childhood asthma: current evidence and future research directions.
Salam MT, Zhang Y, Begum K. Epigenomics. 2012 Aug;4(4):415-29. doi: 10.2217/epi.12.32. PMID: 22920181. Free PMC article. Review.

References

1. Zheng M, Barrera LO, Ren B, Wu YN. ChIP-chip: Data, Model, and Analysis. Biometrics. 2007;63(3):787–796. doi: 10.1111/j.1541-0420.2007.00768.x.
2. Mohn F, Weber M, Schübeler D, Roloff T-C. Methylated DNA Immunoprecipitation (MeDIP). Methods Mol Biol. 2009;507:55–64. doi: 10.1007/978-1-59745-522-0_5.
3. Ordway JM, Bedell JA, Citek RW, Nunberg A, Garrido A, Kendall R, Stevens JR, Cao D, Doerge RW, Korshunova Y, et al. Comprehensive DNA methylation profiling in a human cancer genome identifies novel epigenetic targets. Carcinogenesis. 2006;27(12):2409–2423. doi: 10.1093/carcin/bgl161.
4. Ballestar E, Paz MF, Valle L, Wei S, Fraga MF, Espada J, Cigudosa JC, Huang TH-M, Esteller M. Methyl-CpG binding proteins identify novel sites of epigenetic inactivation in human cancer. The EMBO Journal. 2003;22(23):6335–6345. doi: 10.1093/emboj/cdg604.
5. Esteller M. Cancer epigenomics: DNA methylomes and histone-modification maps. Nat Rev Genet. 2007;8(4):286–298. doi: 10.1038/nrg2005.
