Nucleic Acids Res. 2012 Nov;40(21):10642-56.
doi: 10.1093/nar/gks848. Epub 2012 Sep 18.

Chromatin signature discovery via histone modification profile alignments

Jianrong Wang et al. Nucleic Acids Res. 2012 Nov.

Abstract

We report on the development of an unsupervised algorithm for the genome-wide discovery and analysis of chromatin signatures. Our Chromatin-profile Alignment followed by Tree-clustering algorithm (ChAT) employs dynamic programming of combinatorial histone modification profiles to identify locally similar chromatin sub-regions and provides complementary utility with respect to existing methods. We applied ChAT to genomic maps of 39 histone modifications in human CD4+ T cells to identify both known and novel chromatin signatures. ChAT was able to detect chromatin signatures previously associated with transcription start sites and enhancers as well as novel signatures associated with a variety of regulatory elements. Promoter-associated signatures discovered with ChAT indicate that complex chromatin signatures, made up of numerous co-located histone modifications, facilitate cell-type specific gene expression. The discovery of novel L1 retrotransposon-associated bivalent chromatin signatures suggests that these elements influence the mono-allelic expression of human genes by shaping the chromatin environment of imprinted genomic regions. Analysis of long gene-associated chromatin signatures points to a role for the H4K20me1 and H3K79me3 histone modifications in transcriptional pause release. The novel chromatin signatures and functional associations uncovered by ChAT underscore the ability of the algorithm to yield novel insight into chromatin-based regulatory mechanisms.

Figures

Figure 1.
Scheme of the ChAT algorithm. (A) For a series of genomic regions, combinatorial histone modification distributions are represented by ChIP-seq profile matrices. Each genomic region under consideration is divided into 200 bp non-overlapping bins, and each bin is associated with a column vector summarizing the ChIP-seq tag counts for the different histone modifications. The contiguous landscape of each individual histone modification along the genomic region is represented by the corresponding row vector. (B) Histone modification ChIP-seq tag counts are smoothed and transformed to produce normalized scores. (C) Dynamic programming is used to identify sub-regions with similar chromatin signatures. For each pair of genomic regions, a local dynamic programming algorithm is used to compare the column vectors of one region with those of the other (i.e. the combinatorial histone modification signatures of individual genomic bins), and the best alignment path (red) is identified. (D) Pairwise P-values are computed based on a null distribution of high-scoring chromatin segment pairs (islands) found between unrelated genomic regions. Dynamic programming is used to identify high-scoring islands (grey lines), and the score distributions of the islands are used to estimate the parameters of extreme-value distributions for P-value calculation. (E) Pairwise P-values are organized into a distance matrix that is used for hierarchical clustering of similar chromatin sub-regions. The resulting tree of chromatin signatures can be partitioned using an explicit P-value threshold (purple line) to identify groups of related signatures.
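The local alignment step in panel (C) can be illustrated with a minimal Smith-Waterman-style sketch over binned profiles. This is not the ChAT implementation: the bin similarity function (`bin_score`, cosine similarity minus an offset), the linear gap penalty, and all names below are assumptions introduced for illustration, since the scoring scheme is not given in the caption.

```python
from typing import List, Sequence, Tuple

# One column vector (normalized scores per histone modification) per 200 bp bin.
Profile = List[Sequence[float]]


def bin_score(u: Sequence[float], v: Sequence[float], offset: float = 0.5) -> float:
    """Hypothetical bin similarity: cosine similarity minus an offset, so that
    dissimilar bins score negative (needed for a local alignment to terminate)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5
    return (dot / norm if norm else 0.0) - offset


def local_align(
    x: Profile, y: Profile, gap: float = 1.0
) -> Tuple[float, Tuple[int, int], Tuple[int, int]]:
    """Local alignment of two binned profiles (assumed linear gap penalty).

    Returns (best score, (start, end) bins in x, (start, end) bins in y),
    with end indices exclusive.
    """
    n, m = len(x), len(y)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    # origin[i][j] is the cell where the best path ending at (i, j) started.
    origin = [[(i, j) for j in range(m + 1)] for i in range(n + 1)]
    best, best_cell = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (0.0, (i, j)),  # restart: no alignment ends here
                (H[i - 1][j - 1] + bin_score(x[i - 1], y[j - 1]), origin[i - 1][j - 1]),
                (H[i - 1][j] - gap, origin[i - 1][j]),  # gap in y
                (H[i][j - 1] - gap, origin[i][j - 1]),  # gap in x
            ]
            H[i][j], origin[i][j] = max(candidates, key=lambda c: c[0])
            if H[i][j] > best:
                best, best_cell = H[i][j], (i, j)
    (si, sj), (ei, ej) = origin[best_cell[0]][best_cell[1]], best_cell
    return best, (si, ei), (sj, ej)


# Toy profiles (made-up values): a three-bin pattern embedded at different offsets.
z = (0.0, 0.0, 0.0)
motif = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
x = [z, z] + motif + [z]
y = motif + [z, z]
score, x_range, y_range = local_align(x, y)
```

In this toy case the shared three-bin pattern is recovered as bins 2 to 4 of `x` aligned with bins 0 to 2 of `y`. Scores of such aligned segments would then feed the P-value and clustering steps described in panels (D) and (E).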
Figure 2.
TSS-associated chromatin signatures. (A) A TSS-associated signature based on enrichment of H3K4me3 is represented as a heatmap (yellow, high; blue, low levels of modification) and an enrichment profile showing the average modification scores across the signature. H3K4me3 tag counts (red) are shown for an instance of this signature at a human promoter locus. (B) A TSS-associated signature composed of five active histone modifications along with an example of this pattern seen at a divergent promoter locus. (C) A bivalent TSS-associated signature with three active modifications and one repressive modification (H3K27me3). Distributions of the active (red) and repressive (blue) histone modification tag counts are shown for a single promoter locus.
Figure 3.
Differential gene expression associated with specific TSS chromatin signatures. (A) Median CD4+ T-cell expression levels (±1 quartile) of genes with TSS marked by 36 distinct chromatin signatures. Bivalent TSS signatures (blue bars) correspond to lower overall expression levels than active signatures (orange bars). (B) Cell-type specific gene expression patterns associated with different TSS chromatin signatures. Gene expression levels across 79 cell types (red, high; green, low) are shown for genes with TSS marked by a bivalent signature versus genes with TSS marked by an active signature. Expression differences are most pronounced for the indicated T cells and B cells.
Figure 4.
Cell-type specific expression associated with complex chromatin signatures. (A) Average (±SD) expression levels (blue, T- or B-cell expression; grey, other cell-type expressions) of genes with TSS marked by two different chromatin signatures (s1 and s2). (B) Enrichment profiles showing the average histone modification scores across signature s1. (C) Enrichment profiles showing the average histone modification scores across signature s2. (D) Box plots showing T- or B-cell specific expression level distributions for different sets of chromatin signatures.
Figure 5.
TTS-associated chromatin signatures. TTS signatures associated with three (A) and two (B) histone modification combinations are shown (histone modification representations described as for Figure 2). (C) A specific TTS proximal locus showing adjacent locations of each of these two patterns. (D) Pol II enrichment profile within genomic regions marked by the signature shown in (A). (E) Pol II enrichment profile within genomic regions marked by the signature shown in (B).
Figure 6.
Enhancer-associated chromatin signatures. (A) ∼100 kb genomic region with three locations (black bars) marked by a specific enhancer-associated signature composed of co-located peaks of H3K4me1, H3K4me3, H3K27ac and H3K36ac (ChIP-seq tag counts in red). All of the three locations overlap with p300 binding sites. (B) Histone modification enrichment profiles of an enhancer-associated mono-modal signature. (C) Enrichment profiles of an enhancer-associated bi-modal signature. Histone modification representations are as described for Figure 2.
Figure 7.
CNE-associated chromatin signatures. (A) Distribution of FEs of CNEs for all small-sized signatures. (B) Histone modification enrichment profiles (as described for Figure 2) for a repressive signature highly enriched within CNEs. (C) Cell-type specific expression levels for genes proximal to CNEs bearing the repressive signature shown in (B). (D) Distribution of the ratios of T- or B-cell average expressions and other cell type average expressions for genes shown in (C) (observed, red; expected, grey). Observed ratios are significantly smaller than expected ratios calculated from gene expression levels randomly simulated across cell-types and tissues (P = 1.3 × 10⁻¹⁰, Mann–Whitney test).
Figure 8.
A bivalent chromatin signature associated with L1 retrotransposons. (A) Histone modification enrichment profiles (as described for Figure 2) for the bivalent signature. (B) A single genomic region with three locations marked by the L1 characteristic bivalent signature. ChIP-seq tag counts are shown for the active mark H3K4me3 (red) and the repressive mark H3K9me3 (blue).
Figure 9.
Large-sized chromatin signatures associated with gene bodies. (A, B) Histone modification enrichment profiles (as described for Figure 2) are shown for two chromatin signatures composed of the same constituent modifications and spatial patterns with distinct sizes. (C) Specific instances of each signature co-located with human gene bodies are shown with modification ChIP-seq tag counts in red and RNA-seq tag counts in black. (D) Percentage of these two large-sized signatures that overlap with gene bodies (grey, any coverage; blue, >50% coverage; orange, >80% coverage; red, >95% coverage of the gene body). (E) Two examples where signature B is co-located with individual genomic regions that are annotated as intergenic but show evidence of being genic from RNA-seq and spliced EST data. (F) Average CD4+ T-cell expression levels for genes marked by signatures A and B.
Figure 10.
Transcriptional pause release associated with H4K20me1 and H3K79me3. The ratio of Pol II density downstream of TSS (+1 to +5 kb) over its density around TSS (−1 to +1 kb) is positively correlated with the density of downstream H4K20me1 (A, Spearman’s ρ = 0.54) and H3K79me3 (B, Spearman’s ρ = 0.51).
