Nat Commun. 2015 Feb 18:6:6370.
doi: 10.1038/ncomms7370.

Epigenomic footprints across 111 reference epigenomes reveal tissue-specific epigenetic regulation of lincRNAs

Viren Amin et al. Nat Commun.

Abstract

Tissue-specific expression of lincRNAs suggests developmental and cell-type-specific functions, yet tissue specificity has been established for only a small fraction of lincRNAs. Here, by analysing 111 reference epigenomes from the NIH Roadmap Epigenomics project, we determine tissue-specific epigenetic regulation for 3,753 lincRNAs (69% of those examined), with 54% active in one of the 14 cell/tissue clusters and an additional 15% in two or three clusters. A larger fraction of lincRNA TSSs is marked in a tissue-specific manner by H3K4me1 than by H3K4me3. The tissue-specific lincRNAs are strongly linked to tissue-specific pathways and undergo distinct chromatin state transitions during cellular differentiation. Polycomb-regulated lincRNAs reside in the bivalent state in embryonic stem cells and many of them undergo H3K27me3-mediated silencing at early stages of differentiation. The exquisitely tissue-specific epigenetic regulation of lincRNAs and the assignment of a majority of them to specific tissue types will inform future studies of this newly discovered class of genes.


Figures

Figure 1. Clustering of epigenomes.
(a) Hierarchical clustering of 99 cell and tissue types was performed using 40 different combinations of five histone modifications (columns) and eight groups of regions of interest (rows). A high score for a particular combination (one score per histone mark and region-of-interest pair, rounded to a single decimal place) indicates that the subtrees of a tree constructed using the combination are frequently confirmed across the 40 combinations. The combinations with the highest scores are deemed most informative and are highlighted in yellow. Column bars and row bars indicate the average informativeness of specific histone modifications and regions of interest, respectively. (b) Circular dendrogram constructed using the top-scoring combination, indicated by the yellow circle in panel a (average H3K4me1 histone modification signal over 3,000 bp windows centred on lincRNA transcription start sites). The highlighted sections in the dendrogram correspond to fourteen major clusters and nine groups of clusters (indicated by nine distinct colours) corresponding to related cell and tissue types. The coloured sections correspond to clusters that are highly reproducible (high bootstrap scores, see Supplementary Fig. 2) and can be derived using different combinations of histone marks and regions of interest.
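The top-scoring feature in panel b, an average signal over a 3,000 bp window centred on a TSS, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `window_mean`, the per-base list representation of the signal, and the toy values are all assumptions made for illustration.

```python
def window_mean(signal, tss, window=3000):
    """Mean of a per-base signal over a window centred on a TSS.

    `signal` is a hypothetical list of per-base values for one chromosome,
    `tss` is a 0-based position, and the window is clipped at the ends.
    """
    half = window // 2
    start = max(0, tss - half)
    end = min(len(signal), tss + half)
    values = signal[start:end]
    return sum(values) / len(values) if values else 0.0


# Toy signal: 1,000 bases of background, 3,000 bases of enrichment, 1,000 of background.
toy_signal = [0.0] * 1000 + [2.0] * 3000 + [0.0] * 1000
centred = window_mean(toy_signal, tss=2500)  # window lies fully inside the enriched block
edge = window_mean(toy_signal, tss=500)      # window is clipped at the start of the signal
```

Computing one such value per lincRNA TSS and per epigenome would give the kind of matrix on which a clustering like the one in panel b could be run.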
Figure 2. Lineage-specific regulatory regions and associated phenotypes.
(a) Lineage-specific regulatory regions were determined by comparing epigenomes within a cluster against the epigenomes outside the cluster using linear regression modelling (LIMMA, P value<0.05). The pie chart shows the percentage of regions that harbour cluster-specific marks, some unique to the cluster and some shared by two or three related subtrees, and highlights regulatory regions that are less specifically modified (grey). (b) Distribution of regulatory regions that are unique and shared for each cluster. (c) Enrichment of mouse phenotype terms associated with lineage-specific regulators, calculated using the GREAT tool’s binomial approach (lineages representing all three germ layers were selected).
Figure 3. Epigenomic footprints of lincRNA transcription start sites in the H1 embryonic stem cell.
LincRNA transcription start sites (lincRNA TSSs) belong to five distinct chromatin state classes. (a) LincRNA TSSs that had differential histone modification signals (in at least one histone mark) in stem cells were used to perform Spark analysis. Spark performs k-means clustering (k=5, bin size=100 bp) to group regions that have similar epigenomic footprints. Clustering analysis reveals five distinct classes of lincRNA TSSs for H1 stem cells: quiescent (C1: Quies), enhancer (C2: Enh), transcription start site active (C3: TssA), bivalent (C4: Biv) and quiescent/heterochromatin (C5: Quies/Het). Each Spark cluster was subjected to further analyses (b–f). (b) Absolute distance of lincRNA TSSs to the nearest protein-coding TSS determined using the GREAT basal+extension rule (1 kb downstream+5 kb upstream+up to 500 kb distal). The absolute distances are binned into <5 kb, 5–50 kb and >50–500 kb windows. (c) Evolutionary age estimates of lincRNAs based on sequence conservation. (d) ChromHMM state enrichments of the lincRNA TSS clusters. (e) Density function showing expression of lincRNAs (left) and neighbouring protein-coding genes in RPKM (reads per kilobase per million) units. (f) Enrichment of ENCODE transcription factor binding sites for bivalent lincRNA TSS clusters (hypergeometric tests, P<0.0005). Gene ontology term enrichment (blue: biological process; red: mouse phenotype; green: Mouse Genome Informatics (MGI) expression) of neighbouring protein-coding genes for the bivalent lincRNA TSS cluster. Terms identified using GREAT are significant by both hypergeometric and binomial tests (P<0.05).
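The distance binning in panel b can be sketched as follows. This is a simplified illustration, not GREAT itself: it takes the plain absolute distance to the nearest protein-coding TSS and ignores strand and the basal+extension rule. The function name `bin_distance`, the treatment of the exact 5 kb and 50 kb boundaries, and the toy coordinates are assumptions.

```python
def bin_distance(lincrna_tss, coding_tss_positions):
    """Label the absolute distance from a lincRNA TSS to the nearest coding TSS.

    Bins follow the caption: <5 kb, 5-50 kb and >50-500 kb. Distances beyond
    500 kb return None (outside the distal limit quoted in the caption).
    """
    nearest = min(abs(lincrna_tss - pos) for pos in coding_tss_positions)
    if nearest < 5_000:
        return "<5 kb"
    if nearest <= 50_000:
        return "5-50 kb"
    if nearest <= 500_000:
        return ">50-500 kb"
    return None


# Toy coordinates on a single hypothetical chromosome.
close = bin_distance(100_000, [102_000, 400_000])
medium = bin_distance(100_000, [130_000])
far = bin_distance(100_000, [400_000])
```

Counting the labels per Spark cluster would give the kind of per-cluster distribution the panel describes.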
Figure 4. Coordinated changes of chromatin states at lincRNA TSSs during T-cell differentiation.
(a) Spark clustering reveals coordinated changes in histone marks between human embryonic stem cells (H1), haematopoietic stem cells (CD34+) and T-lymphocyte cells (CD8+). The black bar plot indicates the number of lincRNA TSSs that show a specific pattern of epigenetic programming across the three developmental time points. (b) Absolute distance of lincRNA TSSs to the nearest protein-coding TSS determined using the GREAT basal+extension rule (1 kb downstream+5 kb upstream+up to 500 kb distal). The absolute distances are binned into <5 kb, 5–50 kb and >50–500 kb windows. (c) ChromHMM-defined state transitions from embryonic stem cells (ESCs) to haematopoietic stem cells (HSCs) and from HSCs to T cells were mapped for lincRNA TSSs in C2. The size of each node reflects the number of states and the edge width reflects the number of transitions. Transitions >30% relative to each state are shown in the arc diagram. (d) Bar plot showing Shannon entropy calculated for all states, Polycomb states or non-Polycomb states for the three developmental time points (ESCs, HSCs and T cells).
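The Shannon entropy in panel d can be illustrated with a minimal sketch. The caption does not state the logarithm base or any normalisation, so base 2 and raw state counts are assumptions here, as are the function name `shannon_entropy` and the toy state labels.

```python
import math
from collections import Counter


def shannon_entropy(states):
    """Shannon entropy (in bits) of a list of chromatin-state labels."""
    counts = Counter(states)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed state frequencies p.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy label sets: one state only, a skewed mix, and four equally frequent states.
uniform_state = shannon_entropy(["TssA"] * 4)
skewed_mix = shannon_entropy(["Biv", "Biv", "TssA", "Quies"])
even_mix = shannon_entropy(["Biv", "TssA", "Enh", "Quies"])
```

Under these assumptions, a set of TSSs concentrated in one state scores lower than a set spread evenly across several states, which is the quantity compared across the three time points.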
