Genome Biol. 2021 May 5;22(1):136.
doi: 10.1186/s13059-021-02350-x.

Intergenic RNA mainly derives from nascent transcripts of known genes

Federico Agostini et al. Genome Biol.

Abstract

Background: Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs originating from intergenic regions in the human genome remain unclear.

Results: We hypothesize that many intergenic RNAs can be ascribed to the presence of as-yet unannotated genes or the "fuzzy" transcription of known genes that extends beyond the annotated boundaries. To elucidate the contributions of these two sources, we assemble a dataset of more than 2.5 billion publicly available RNA-seq reads across 5 human cell lines and multiple cellular compartments to annotate transcriptional units in the human genome. About 80% of transcripts from unannotated intergenic regions can be attributed to the fuzzy transcription of existing genes; the remaining transcripts originate mainly from putative long non-coding RNA loci that are rarely spliced. We validate the transcriptional activity of these intergenic RNAs using independent measurements, including transcriptional start sites, chromatin signatures, and genomic occupancies of RNA polymerase II in various phosphorylation states. We also analyze the nuclear localization and sensitivities of intergenic transcripts to nucleases to illustrate that they tend to be rapidly degraded either on-chromatin by XRN2 or off-chromatin by the exosome.

Conclusions: We provide a curated atlas of intergenic RNAs that distinguishes alternative processing of well-annotated genes from independent transcriptional units, based on the combined analysis of chromatin signatures, nuclear RNA localization, and degradation pathways.

Keywords: Gene annotation; RNA; RNA-seq; Transcription.
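
The distinction drawn in the Results between gene-associated ("fuzzy") transcription and independent transcriptional units can be illustrated with a minimal sketch. The abstract does not state the criteria the authors used, so everything below is an assumption made for illustration: the `Interval` type, the `classify_tu` function, and the `MAX_GAP` distance threshold are hypothetical, and the rule (an intergenic unit that abuts a same-strand gene within a fixed gap is treated as an extension of that gene) is a simplification, not the published pipeline.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


# Hypothetical threshold: the largest gap (in bp) between a gene boundary and
# an intergenic unit for the unit to count as an extension of that gene.
MAX_GAP = 1000


def classify_tu(tu: Interval, genes: Iterable[Interval], max_gap: int = MAX_GAP) -> str:
    """Label an intergenic unit as a gene extension or as independent.

    Assumes `tu` does not overlap any annotated gene (it is intergenic).
    """
    for gene in genes:
        if gene.chrom != tu.chrom or gene.strand != tu.strand:
            continue
        gap_after = tu.start - gene.end    # unit lies at higher coordinates
        gap_before = gene.start - tu.end   # unit lies at lower coordinates
        if 0 <= gap_after <= max_gap:
            # Higher coordinates are downstream only for plus-strand genes.
            return "downstream_of_gene" if gene.strand == "+" else "upstream_of_gene"
        if 0 <= gap_before <= max_gap:
            return "upstream_of_gene" if gene.strand == "+" else "downstream_of_gene"
    return "independent"


genes = [
    Interval("chr1", 1000, 2000, "+"),
    Interval("chr1", 10000, 12000, "-"),
]
read_through = classify_tu(Interval("chr1", 2000, 3500, "+"), genes)
isolated = classify_tu(Interval("chr1", 5000, 6000, "+"), genes)
```

In this toy example, `read_through` is labeled as a downstream extension of the first gene, while `isolated` sits too far from any same-strand gene and is labeled independent. A real analysis would also draw on the orthogonal evidence the abstract lists, such as transcriptional start sites and chromatin signatures.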

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Flow chart of the data analysis pipeline. Schematic describing the main data processing steps, intermediate and final outputs of the analysis pipeline, applied to RNA-seq (left side) and other sequencing (NET-seq, right side) data. Procedures (blue) and tools (orange) are indicated
Fig. 2
General features of newly identified transcriptional units (TUs). a Schematic representation of the gene-associated (green) and independent (yellow) transcriptional units annotated in this study. b Upper panels: genome browser views of nuclear RNA-seq signals in HeLa cells for example TUs (red and blue indicate RNA-seq reads mapping to the sense and antisense strands, respectively). Lower panels: genomic annotations of pre-existing genes and newly identified TUs; the horizontal line divides the features on the sense (S) and antisense (A) orientations. Coverage is reported at 1× depth (reads per genome coverage (RPGC)). c Comparison of the number of annotated and newly identified transcripts detected in the current RNA-seq dataset (TPM ≥ 1). d Comparison of transcript lengths. e Proportions of uniquely mapping RNA-seq reads originating from different transcript types for the whole cell (left) and nuclear (right) subcellular fractions of HeLa cells. f Distributions of expression levels of annotated and newly identified TUs for the chromatin-associated (left panel) and nucleoplasm (right panel) subcellular fractions of HeLa cells
Fig. 3
Meta-profiles of transcriptional measurements around gene-associated TUs. Meta-profiles of transcriptional measurements plotted relative to the start positions of UoGs and their associated protein-coding genes (left-hand panels) and relative to the end positions of DoGs and their associated genes (right-hand panels). a RNA-seq measurements in different subcellular compartments. b CAGE-seq measurements in the sense and antisense strands. c NET-seq measurements for different Pol II CTD modifications. d ChIP-seq measurements for histone marks and EP300 occupancy-associated transcriptional activities
Fig. 4
Meta-profiles of transcriptional measurements around independent TUs. Meta-profiles of transcriptional measurements plotted relative to the start and end positions of independent TUs and control long non-coding RNA genes. a–d as in Fig. 3
Fig. 5
Impact of nuclease depletion on TU expression. a Expression levels of protein-coding genes and TUs in the chromatin and nucleoplasm fractions. b Relative nucleoplasmic-to-chromatin expression levels in response to EXOSC3 knockdown and control siLuc treatments. c Expression levels in CSTF2+CSTF2T and CPSF3 knockdowns relative to control in the chromatin fraction. d Expression levels in XRN2 knockdown (via activation of the auxin-inducible degron system) and basal (uninduced; minus auxin) treatments relative to unmodified XRN2 control in the nuclear fraction. p values were calculated using the two-sided Wilcoxon rank sum test, with asterisks indicating statistical significance at the following thresholds: ns, p > 0.05; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001; ****p ≤ 0.0001
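
The significance notation in the Fig. 5 legend can be written as a small lookup. This is a minimal sketch, not code from the paper: the function name `significance_label` is hypothetical, and only the thresholds are taken from the legend. The p value itself is assumed to have been computed elsewhere with a two-sided rank sum test.

```python
def significance_label(p: float) -> str:
    """Map a p value to the asterisk notation listed in the Fig. 5 legend."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p value must lie in [0, 1], got {p}")
    # Check the strictest threshold first so that each p value receives
    # the largest number of asterisks it qualifies for.
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.05, 0.01, 0.0005, 0.0001)]
```

The thresholds are inclusive, as in the legend, so a p value of exactly 0.05 is marked with one asterisk rather than "ns".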

