Cancer Inform. 2015 Jan 27;14(Suppl 1):11-22.
doi: 10.4137/CIN.S13972. eCollection 2015.

Nonparametric Tests for Differential Histone Enrichment with ChIP-Seq Data

Qian Wu et al. Cancer Inform.

Abstract

Chromatin immunoprecipitation sequencing (ChIP-seq) is a powerful method for analyzing protein interactions with DNA. It can be applied to identify the binding sites of transcription factors (TFs) and the genomic landscape of histone modification marks (HMs). Previous research has largely focused on developing peak-calling procedures to detect TF binding sites. However, these procedures may fail when applied to ChIP-seq data of HMs, which have diffuse signals and multiple local peaks. In addition, it is important to identify genes with differential histone enrichment regions between two experimental conditions, such as different cellular states or different time points. Parametric methods based on the Poisson or negative binomial distribution have been proposed to address this differential enrichment problem, and most of these methods require biological replicates. However, many ChIP-seq datasets have few or even no replicates. We propose a nonparametric method to identify genes with differential histone enrichment regions even without replicates. Our method is based on nonparametric hypothesis testing and kernel smoothing in order to capture the spatial differences in histone-enriched profiles. We demonstrate the method using ChIP-seq data from a comparative epigenomic profiling of adipogenesis of murine adipose stromal cells and the Encyclopedia of DNA Elements (ENCODE) ChIP-seq data. Our method identifies many genes with differential H3K27ac histone enrichment profiles at gene promoter regions between proliferating preadipocytes and mature adipocytes in murine 3T3-L1 cells. The test statistics also correlate well with gene expression changes and are predictive of them, indicating that the identified differentially enriched regions are indeed biologically meaningful.

Keywords: kernel smoothing; nonparametric testing; normalization; spatial histone profiles.
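
The kernel-smoothing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' test statistic: the function names, the Gaussian kernel, the 280-bin promoter window, and the mean-squared summary are all assumptions chosen for illustration. The sketch smooths the bin-wise difference between two read-count profiles and summarizes how much the smoothed profiles differ in shape.

```python
import math


def kernel_smooth(counts, bandwidth):
    """Gaussian kernel smoother over bin centres rescaled to [0, 1].

    Hypothetical helper: the kernel and the rescaling are assumptions,
    not taken from the paper.
    """
    n = len(counts)
    centres = [(i + 0.5) / n for i in range(n)]
    smoothed = []
    for x in centres:
        # Weight every bin by its distance from the current bin centre.
        weights = [math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centres]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, counts)) / total)
    return smoothed


def smoothed_profile_distance(counts_a, counts_b, bandwidth):
    """Mean squared value of the kernel-smoothed bin-wise difference.

    Illustrative summary only; it is not the Z statistic from the paper.
    """
    diff = [a - b for a, b in zip(counts_a, counts_b)]
    smooth = kernel_smooth(diff, bandwidth)
    return sum(s * s for s in smooth) / len(smooth)


# Toy profiles (made-up data): same total read count, peak at a different position.
n_bins = 280  # assumed number of bins in the promoter window
day_minus2 = [5 if 100 <= i < 140 else 1 for i in range(n_bins)]
day_7 = [5 if 150 <= i < 190 else 1 for i in range(n_bins)]

same = smoothed_profile_distance(day_minus2, day_minus2, 20 / n_bins)
shifted = smoothed_profile_distance(day_minus2, day_7, 20 / n_bins)
print(same, shifted)
```

Both toy profiles contain the same total number of reads, so a fold change computed from total counts would equal 1, while the smoothed distance is positive because the enrichment sits at a different position. A larger bandwidth averages over more neighbouring bins and so suppresses narrow differences.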

Figures

Figure 1
Histograms of two test statistics for the mouse adipogenesis ChIP-seq data, (A) Z0λ and (B) Z0λ,WH, for the 9,874 genes whose maximum read count is below 5 on both day −2 and day 7. The red curve in each plot represents the standard normal density.
Figure 2
Observed mouse adipogenesis ChIP-seq bin counts over the promoter region for the top 12 genes ranked by the test statistic Z0λ,WH, for day −2 (red) and day 7 (black). The vertical line marks the transcription start site.
Figure 3
Observed ChIP-seq bin counts over the promoter region for day −2 (red) and day 7 (black) for 12 genes with large Z,WH but small fold changes. The vertical line marks the transcription start site.
Figure 4
Plots of gene expression fold changes as a function of two different test statistics. Top: proposed smoothing-kernel test statistics; bottom: fold changes. Left panel: genes with enriched H3K27ac binding at day −2; right panel: genes with enriched H3K27ac binding at day 7.
Figure 5
Plots of proportions of up/down-regulated genes in different intervals of the test statistics for the mouse adipogenesis ChIP-seq data. (A)–(B): proposed smoothing-kernel test statistics; (C)–(D): fold change statistics; (A), (C): genes with enriched H3K27ac at day −2; (B), (D): genes with enriched H3K27ac at day 7.
Figure 6
Model fit (left panel) and prediction (right panel) for the log of the gene expression fold changes using the proposed statistics Z,WH (top panel) and fold changes (bottom panel) of six histone-modification ChIP-seq datasets in the promoter, gene body, and downstream regions.
Figure 7
Histograms of the test statistic Zλi,WH at different bandwidths: (A) λ1 = 5/280, (B) λ2 = 20/280, (C) λ3 = 60/280, (D) λ4 = 90/280, for the 9,874 genes whose maximum read count is below 5 on both day −2 and day 7 in the mouse adipogenesis ChIP-seq data.
Figure 8
Top: Histogram of the differential enrichment test statistic Znew between two biological replicates of the ENCODE data for all 23,807 genes. Bottom: Histogram of the differential enrichment test statistic Znew between two cell types (B-lymphoblastoid cells vs HeLa-S3 cervical carcinoma cells) of the ENCODE data for all 23,807 genes. The red curve represents the standard normal density.

