NAR Genom Bioinform. 2022 May 27;4(2):lqac041.
doi: 10.1093/nargab/lqac041. eCollection 2022 Jun.

ePeak: from replicated chromatin profiling data to epigenomic dynamics

Maëlle Daunesse et al. NAR Genom Bioinform.

Abstract

We present ePeak, a Snakemake-based pipeline for the identification and quantification of reproducible peaks from raw ChIP-seq, CUT&RUN and CUT&Tag epigenomic profiling data. It also includes a statistical module to perform tailored differential marking and binding analysis with state-of-the-art methods. ePeak streamlines critical steps such as quality assessment of the immunoprecipitation, spike-in calibration and the selection of reproducible peaks between replicates, for both narrow and broad peaks. It generates complete reports for quality control and interpretation of the results. We advocate for a differential analysis that accounts for the biological dynamics of each chromatin factor. Thus, ePeak provides linear and nonlinear methods for normalisation as well as conservative and stringent models for variance estimation and significance testing of the observed marking/binding differences. Using a published ChIP-seq dataset, we show that distinct populations of differentially marked/bound peaks can be identified. We study their dynamics in terms of read coverage and summit position, as well as the expression of the neighbouring genes. We propose that ePeak can be used to measure the richness of the epigenomic landscape underlying a biological process by identifying diverse regulatory regimes.

Figures

Figure 1.
The ePeak workflow. (A) Five ePeak modules executing specific and interdependent tasks. Stop signs indicate where the analysis ends, depending on the data provided by the user. Stop 1 if no replicates are available. Stop 2 for datasets with replicates for only one condition. Stop 3 when replicates are available for two or more conditions. (B) ePeak Snakemake rule graph illustrating the input/output dependencies between steps. Border colour indicates the module membership of each rule (SPR: self pseudo-replicate, PPR: pooled pseudo-replicate).
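
The stop rules in panel (A) can be read as a simple decision on the experimental design. The following is a minimal sketch, not ePeak's actual code: the function name `stop_point` and the dictionary input are hypothetical, and it assumes that "replicates are available" means at least two samples for a condition.

```python
def stop_point(replicates_per_condition):
    """Return the stop sign (1, 2 or 3) described in Figure 1A.

    replicates_per_condition: dict mapping a condition name to its
    number of samples (hypothetical input format, for illustration).
    """
    # Assumption: a condition is "replicated" if it has >= 2 samples.
    replicated = [c for c, n in replicates_per_condition.items() if n >= 2]
    if not replicated:
        return 1  # no replicates available
    if len(replicated) == 1:
        return 2  # replicates for only one condition
    return 3      # replicates for two or more conditions


print(stop_point({"shControl": 2, "shUbc9": 2}))  # 3
```

Under this reading, a design with two conditions of which only one is replicated still ends at Stop 2.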
Figure 2.
ChIP-seq position variability. Summit instability is defined as the distance between the summit of the peak called in each replicate and the corresponding reproducible peak. Tracks show the IP coverage, peak and summit position for the two replicates separately (top) and pooled (bottom) in two genomic regions. Vertical blue lines indicate the position of the reproducible peak summit and horizontal black lines the distance to the corresponding summit in each replicate.
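
As an illustration of the definition above, summit instability can be computed as an absolute distance in base pairs. This is a sketch under assumptions: the function name and the coordinates are hypothetical, summits are taken as single positions on the same chromosome, and how ePeak matches replicate peaks to reproducible peaks is not shown here.

```python
def summit_instability(replicate_summits, reproducible_summit):
    """Distance (bp) from each replicate summit to the reproducible peak summit.

    replicate_summits: summit positions of the peak called in each replicate.
    reproducible_summit: summit position of the corresponding reproducible peak.
    """
    return [abs(s - reproducible_summit) for s in replicate_summits]


# Hypothetical summit positions for two replicates of one peak
print(summit_instability([10_250, 10_310], 10_280))  # [30, 30]
```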
Figure 3.
Comparison of the statistical settings for differential analysis. (A) Kernel density estimation of read coverage and summit instability for all reproducible peaks across replicates of the two biological conditions under study. (B) Read coverage and summit position variability estimation using the Fquantro statistic (16). (C) Proportion of total differentially marked/bound peaks obtained using each statistical setting. DESeq2 = DESeq2 with geometric mean normalisation; NL-L = limma with both nonlinear and linear normalisation; NL = limma with nonlinear normalisation only; L = limma with linear normalisation only. (D, E) Quantitative characterisation of differentially marked/bound peak populations obtained using each statistical setting. (D) Distribution of ChIP-seq read count dispersion as estimated by limma. (E) Distribution of absolute changes in ChIP-seq read counts between shUbc9 and shControl. Colours correspond to panel (C). (F) Expression dynamics of genes neighbouring differentially marked/bound peak populations. Distribution of absolute changes in RNA-seq read counts between shUbc9 and shControl. Colours correspond to panel (C).

References

    1. Barski A., Cuddapah S., Cui K., Roh T.-Y., Schones D.E., Wang Z., Wei G., Chepelev I., Zhao K. High-resolution profiling of histone methylations in the human genome. Cell. 2007; 129:823–837.
    2. Johnson D.S., Mortazavi A., Myers R.M., Wold B. Genome-wide mapping of in vivo protein-DNA interactions. Science (New York, N.Y.). 2007; 316:1497–1502.
    3. Skene P.J., Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife. 2017; 6:e21856.
    4. Janssens D.H., Meers M.P., Wu S.J., Babaeva E., Meshinchi S., Sarthy J.F., Ahmad K., Henikoff S. Automated CUT&Tag profiling of chromatin heterogeneity in mixed-lineage leukemia. Nat. Genetics. 2021; 53:1586–1596.
    5. Landt S.G., Marinov G.K., Kundaje A., Kheradpour P., Pauli F., Batzoglou S., Bernstein B.E., Bickel P., Brown J.B., Cayting P. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012; 22:1813–1831.