2015 Dec 28;36(5):809-19.
doi: 10.1128/MCB.00955-15.

Chromatin and RNA Maps Reveal Regulatory Long Noncoding RNAs in Mouse

Gireesh K Bogu et al. Mol Cell Biol.

Abstract

Discovering and classifying long noncoding RNAs (lncRNAs) across all mammalian tissues and cell lines remains a major challenge. Previously, mouse lncRNAs were identified using transcriptome sequencing (RNA-seq) data from a limited number of tissues or cell lines. Additionally, associating a few hundred lncRNA promoters with chromatin states in a single mouse cell line has identified two classes of chromatin-associated lncRNA. However, the discovery and classification of lncRNAs is still pending in many other tissues in mouse. To address this, we built a comprehensive catalog of lncRNAs by combining known lncRNAs with high-confidence novel lncRNAs identified by mapping and de novo assembling billions of RNA-seq reads from eight tissues and a primary cell line in mouse. Next, we integrated this catalog of lncRNAs with multiple genome-wide chromatin state maps and found two different classes of chromatin state-associated lncRNAs, including promoter-associated (plncRNAs) and enhancer-associated (elncRNAs) lncRNAs, across various tissues. Experimental knockdown of an elncRNA resulted in the downregulation of the neighboring protein-coding Kdm8 gene, encoding a histone demethylase. Our findings provide 2,803 novel lncRNAs and a comprehensive catalog of chromatin-associated lncRNAs across different tissues in mouse.

Figures

FIG 1
Overview of the lncRNA discovery and chromatin state map computational pipeline. (A) Overview of the lncRNA discovery and chromatin state map-based classification pipeline that was employed using both RNA-seq and ChIP-seq data from 8 tissues and one primary cell line (ES) in mouse. RNA-seq reads from all the tissues and the cell line were mapped using TopHat 2 against the mouse reference genome (mm9), and transcriptomes were assembled de novo using Cufflinks 2 and Scripture v4 assemblers. Common transcripts that were assembled by both Cufflinks 2 and Scripture v4 were scanned for lncRNA features like size, length, exon number, expression, and coding score. A library of intergenic lncRNAs was constructed by pooling lncRNAs identified in this study and previous studies. In total, 10,728 unique lncRNAs were overlapped with chromatin state maps discovered by using ChromHMM by pooling various ChIP-seq data sets and classified chromatin-associated lncRNAs in mouse. (B) Overlap between lncRNAs identified in this study (small circle) and previously published lncRNAs (large circle; UCSC/Ensembl/RefSeq [5, 17, 20, 34–38]). A total of 2,803 nonannotated lncRNAs were identified, and 34% (13,382) of the known lncRNAs were recovered in this study. (C) RNA-seq coverage tracks showing the expression of a novel lncRNA identified in this study (black). Transcription in testes is shown. “+” and “−” indicate sense and antisense directions, respectively, and experimental replicates are numbered 1 and 2.
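The filtering step described in panel A (keep transcripts assembled by both assemblers, then screen them on lncRNA features) might be sketched as follows. This is a minimal illustration, not the authors' code: the `Transcript` record, the function name, the matching of transcripts by a shared identifier, and every numeric cutoff are assumptions made for the example, since the caption does not state the values used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one assembled transcript."""
    transcript_id: str
    length_nt: int
    exon_count: int
    fpkm: float
    coding_score: float


# Hypothetical cutoffs for illustration only; the caption does not give them.
MIN_LENGTH_NT = 200
MIN_EXONS = 2
MIN_FPKM = 0.1
MAX_CODING_SCORE = 0.0


def candidate_lncrnas(cufflinks, scripture):
    """Keep transcripts reported by both assemblers that pass every feature filter.

    Assumption: the two assemblies share transcript identifiers. A real
    pipeline would compare exon structures instead.
    """
    shared_ids = {t.transcript_id for t in scripture}
    kept = []
    for t in cufflinks:
        if t.transcript_id not in shared_ids:
            continue  # assembled by only one tool
        if (
            t.length_nt >= MIN_LENGTH_NT
            and t.exon_count >= MIN_EXONS
            and t.fpkm >= MIN_FPKM
            and t.coding_score <= MAX_CODING_SCORE
        ):
            kept.append(t)
    return kept
```

Requiring agreement between two assemblers before filtering trades sensitivity for confidence, which matches the caption's emphasis on common transcripts.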
FIG 2
Tissue- and cell-specific expression of lncRNAs. (A) Heat map representing normalized FPKM expression values of the 2,803 lncRNAs (rows) across eight tissues and a primary cell line (columns). The rows and columns were ordered based on k-means clustering. The color intensity represents the fractional density across the row of log10-normalized FPKM expression values as estimated by Scripture v4. Each tissue has 2 columns, representing its replicates, and the ES cell line has 5 columns. (B) Experimentally validated examples of lncRNAs with tissue-specific expression across heart, liver, and kidney. Shown are qRT-PCR expression levels (duplicates, normalized against the GAPDH housekeeping gene) of heart-specific lncRNAs (H-lnc1 and H-lnc2), liver-specific lncRNAs (L-lnc1 and L-lnc2), and kidney-specific lncRNAs (K-lnc1 and K-lnc2) (see Table S9 in the supplemental material). The error bars indicate standard deviations.
FIG 3
Discovery of chromatin state maps and their association with lincRNAs. (A) Emission parameters learned de novo with ChromHMM on the basis of combinations recurring in chromatin. Each point in the table denotes the frequency with which a given mark is found at genomic positions corresponding to a specific chromatin state. The observation frequencies of various chromatin marks, including H3K36me3, H3K4me1, H3K27ac, Pol II, H3K4me3, CTCF, and H3K27me3, as well as respective inputs showing 6 major chromatin states, including active promoter (red), poised promoter (purple), enhancer (yellow), Polycomb (gray), insulator (blue), and heterochromatin (white), are presented. (B) Percentages of protein-coding TSSs (top) and intergenic lncRNAs (bottom) significantly enriched with both active promoter and strong enhancer (***, P < 0.001; NS, not significant; Fisher exact test). D, observed data; R, randomized TSSs. (C) Percentages of lncRNAs and protein-coding genes that are associated with promoter and enhancer chromatin states. (D) Numbers of plncRNAs and elncRNAs across 8 tissues and an ES cell line. (E) Percentages of lncRNAs (overlapping both CAGE peaks and DNase I hypersensitive sites) associated with promoter and enhancer chromatin states.
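The promoter- versus enhancer-associated labeling summarized in panels C and D can be illustrated with a small sketch. It assumes a lncRNA is classified by the chromatin state segment that contains its TSS and that an active promoter state takes precedence over an enhancer state. Both rules, the state labels, and the function name are assumptions for the example, not the published procedure.

```python
def classify_lncrna(chrom, tss, segments):
    """Label a lncRNA by the chromatin state covering its TSS.

    segments: iterable of (chrom, start, end, state) tuples with half-open
    coordinates [start, end), e.g. parsed from a segmentation file.
    Assumption: promoter state wins if both states cover the TSS.
    """
    states = {
        state
        for seg_chrom, start, end, state in segments
        if seg_chrom == chrom and start <= tss < end
    }
    if "active_promoter" in states:
        return "plncRNA"
    if "enhancer" in states:
        return "elncRNA"
    return "unclassified"


# Toy segmentation with made-up coordinates.
toy_segments = [
    ("chr1", 1000, 2000, "active_promoter"),
    ("chr1", 5000, 6000, "enhancer"),
    ("chr2", 1000, 2000, "heterochromatin"),
]
labels = [
    classify_lncrna("chr1", 1500, toy_segments),
    classify_lncrna("chr1", 5500, toy_segments),
    classify_lncrna("chr2", 1500, toy_segments),
]
```

Because the segmentation differs per tissue, running the same lncRNA against each tissue's segments would yield per-tissue labels, which is the kind of comparison the transition analysis in FIG 4 builds on.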
FIG 4
Transcript length, expression, and transition of chromatin-associated lncRNAs in mouse. (A) Transcript lengths of elncRNAs (median = 6,565 nt) and plncRNAs (median = 6,450 nt) across eight tissues and a cell line, showing no difference in length (Mann-Whitney test; NS, not significant; P = 0.9848). (B) Log-normalized expression (FPKM) of elncRNAs (median = 0.08 FPKM) and plncRNAs (median = 0.33 FPKM) across eight tissues and an ES cell line, showing a significant difference between them (Mann-Whitney test; ***, P = 1.221e−10). (C) Circos plot showing the transition of plncRNA to elncRNA, or elncRNA to plncRNA, across eight tissues and an ES cell line. The outer bars indicate the total numbers of chromatin-associated lncRNAs that undergo a transition per tissue or cell line, which included whole brain (20 plncRNAs and 72 elncRNAs), ES cells (62 plncRNAs and 8 elncRNAs), heart (44 plncRNAs and 4 elncRNAs), small intestine (17 plncRNAs and 18 elncRNAs), kidney (50 plncRNAs and 24 elncRNAs), liver (46 plncRNAs and 10 elncRNAs), spleen (55 plncRNAs and 12 elncRNAs), testis (29 plncRNAs and 12 elncRNAs), and thymus (47 plncRNAs and 40 elncRNAs). The links inside the bars indicate the numbers of lncRNAs that switch their chromatin states from one tissue to another (red, plncRNAs; gold, elncRNAs). The lncRNA transition table used to generate the circos plot is shown in Table S6 in the supplemental material. (D) Percentages of chromatin-associated transitions across all the mouse tissues, showing the high percentage of plncRNA-to-plncRNA transitions compared to elncRNA-to-elncRNA transitions.
FIG 5
An enhancer-associated lncRNA, lncRNA-Kdm8, regulates the expression of a neighboring protein-coding gene, Kdm8. (A) The lncRNA-Kdm8 locus promoter overlaps an enhancer chromatin state and occurs within 20 kb of the TSS of a protein-coding gene, Kdm8 (i.e., it is an enhancer-associated lncRNA). The gene tracks represent DNase I hypersensitive sites (HS) and ChIP-seq data for H3K4me1, H3K27ac, and H3K4me3 from ENCODE. The genomic scale is indicated at the top and the scale of both DNase I HS and ChIP-seq data on the upper right. (B and C) The 5′ and 3′ ends and the exon-intron boundaries of the enhancer-associated lncRNA, lncRNA-Kdm8, were determined by RACE (see the supplemental material). The black arrows depict TSSs and the directions of transcription for the respective genes. Kdm8 mRNA and lncRNA-Kdm8 are shown in green and red, respectively. The genomic DNA sequences corresponding to the 5′ and 3′ ends of the cloned lncRNA are shown in black below the lncRNA-Kdm8 gene track, defining accurate 5′-end and exon-intron boundaries for exon 1 (E1), exon 3, exon 4, and exon 5 of lncRNA-Kdm8. (D) Expression levels of lncRNA-Kdm8 in mouse ES cells and other tissues, as measured by directional RNA-seq and expressed as FPKM. (E) qRT-PCR expression (triplicates, normalized against the RPO housekeeping gene) after siRNA-based knockdown of lncRNA-Kdm8 (chr7: 132560406 to 132561472 [–]). The knockdown resulted in a significant decrease in expression of the neighboring gene, Kdm8 (t test; *, P ≤ 0.05; **, P ≤ 0.01), which was not observed for the negative control, the distant coding gene Taf3 (chr2: 9836179 to 9970236 [+]). The primers used for siRNA oligonucleotides of lncRNA-Kdm8 are given in Table S9 in the supplemental material. The error bars indicate standard deviations.

References

    1. Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, Epstein CB, Frietze S, Harrow J, Kaul R, Khatun J, Lajoie BR, Landt SG, Lee B-K, Pauli F, Rosenbloom KR, Sabo P, Safi A, Sanyal A, Shoresh N, Simon JM, Song L, Trinklein ND, Altshuler RC, Birney E, Brown JB, Cheng C, Djebali S, Dong X, Dunham I, Ernst J, Furey TS, Gerstein M, Giardine B, Greven M, Hardison RC, Harris RS, Herrero J, Hoffman MM, Iyer S, Kellis M, Khatun J, Kheradpour P, Kundaje A, Lassmann T, Li Q, Lin X, Marinov GK, Merkel A, Mortazavi A, et al. 2012. An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. doi:10.1038/nature11247.
    2. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, Derrien T, Drenkow J, Dumais E, Dumais J, Duttagupta R, Falconnet E, Fastuca M, Fejes-Toth K, Ferreira P, Foissac S, Fullwood MJ, Gao H, Gonzalez D, Gordon A, Gunawardena H, Howald C, Jha S, Johnson R, Kapranov P, King B, Kingswood C, Luo OJ, Park E, Persaud K, Preall JB, Ribeca P, Risk B, Robyr D, Sammeth M, Schaffer L, See L-H, Shahab A, Skancke J, Suzuki AM, Takahashi H, Tilgner H, Trout D, Walters N, Wang H, Wrobel J, Yu Y, Ruan X, Hayashizaki Y, Harrow J, Gerstein M, Hubbard T, Reymond A, Antonarakis SE, Hannon G, Giddings MC, Ruan Y, Wold B, Carninci P, Guigó R, Gingeras TR. 2012. Landscape of transcription in human cells. Nature 489:101–108. doi:10.1038/nature11233.
    3. Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigo R. 2012. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res 22:1775–1789. doi:10.1101/gr.132159.111.
    4. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. 2011. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev 25:1915–1927. doi:10.1101/gad.17446611.
    5. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein BE, Kellis M, Regev A, Rinn JL, Lander ES. 2009. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458:223–227. doi:10.1038/nature07672.