Nature. 2012 Sep 6;489(7414):101-8.
doi: 10.1038/nature11233.

Landscape of transcription in human cells

Sarah Djebali et al. Nature.

Abstract

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

Figures

Figure 1. A large majority of Gencode elements are detected by RNA-seq data
Shown are Gencode detected elements in the polyadenylated and non-polyadenylated fractions of cellular compartments (cumulative counts for both RNA fractions and compartments refer to elements present in any of the fractions or compartments). Each box plot is generated from values across all cell lines, thus capturing the dispersion across cell lines. The largest point shows the cumulative value over all cell lines.
Figure 2. Co-transcriptional splicing
a. Short read mappings used to assess splicing completion around exons. (a, b, c) Reads providing evidence that splicing is complete for the region containing the exon, with either exon inclusion (a, b) or exclusion (c). (d, e) Reads providing evidence that splicing of the region containing the exon is not yet complete. The complete splicing index (coSI) is the ratio of a+b+c over a+b+c+d+e and can thus be broadly assumed to correspond to the fraction of RNA molecules in which the region containing the exon has already been spliced (see Tilgner et al.). A coSI value of 1 means that splicing is complete, while a value of 0 indicates that splicing has not yet been initiated. b. Distribution of coSI scores computed on Gencode internal exons. (Top) Distribution in the total chromatin RNA fraction. (Bottom) Distribution in the cytosolic polyadenylated RNA fraction.
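The coSI ratio defined in this caption can be illustrated with a minimal Python sketch. The function name `cosi`, the choice to return `None` when no reads are available, and the read counts in the usage line are assumptions made for illustration only; they are not taken from the paper or from Tilgner et al.

```python
def cosi(a, b, c, d, e):
    """Return (a+b+c) / (a+b+c+d+e) for one exon, or None if there are no reads.

    a, b, c: counts of reads supporting completed splicing (inclusion or exclusion).
    d, e:    counts of reads supporting splicing that is not yet completed.
    """
    spliced = a + b + c
    total = spliced + d + e
    if total == 0:
        # Assumption: an exon with no informative reads gets no score.
        return None
    return spliced / total


# Hypothetical read counts: 15 of 20 informative reads support completed splicing.
print(cosi(8, 7, 0, 3, 2))  # 0.75
```

A value of 1.0 is returned when d and e are both zero, and 0.0 when a, b and c are all zero, matching the two extremes described in the caption.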
Figure 3. Abundance of gene types in cellular compartments
2D Kernel density plots of nuclear over cytosolic enrichment (Y axis) versus overall gene expression in the whole cell extract (X axis), for protein coding, long non-coding and novel genes over all cell lines. Only genes present in all 3 RNA extracts are displayed, as well as two representative genes (ACTG1 in red and H19 in blue), for which the expression in each individual cell line is shown. The actual values of the estimated Kernel density are indicated by contour lines and color shades.
Figure 4. Isoform expression within a gene
a. Number of expressed isoforms per gene per cell line. Genes tend to express many isoforms simultaneously. b. Relative expression of the most abundant isoform per gene per cell line. There is generally one dominant isoform in a given condition.
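The two quantities plotted in this figure can be sketched for a single gene in a single cell line as follows. This is only an illustration: the function name `isoform_summary`, the `min_expr` threshold used to call an isoform "expressed", and the expression values are hypothetical, and the caption does not state how the paper defined either quantity in detail.

```python
def isoform_summary(isoform_expr, min_expr=0.0):
    """Summarise isoform usage for one gene in one cell line.

    isoform_expr: dict mapping isoform ID to an expression value.
    min_expr:     hypothetical threshold; isoforms above it count as expressed.
    Returns (number of expressed isoforms, share of the most abundant isoform).
    """
    n_expressed = sum(1 for value in isoform_expr.values() if value > min_expr)
    total = sum(isoform_expr.values())
    if total == 0:
        # No expression at all: the dominant share is undefined.
        return 0, None
    dominant_share = max(isoform_expr.values()) / total
    return n_expressed, dominant_share


# Hypothetical gene with four annotated isoforms, three of them expressed.
example = {"iso1": 6.0, "iso2": 3.0, "iso3": 1.0, "iso4": 0.0}
print(isoform_summary(example))  # (3, 0.6)
```

Here the gene expresses three isoforms at once, yet one of them accounts for 60% of the total, which is the kind of pattern the caption describes.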
Figure 5. Transcription at enhancers
a. The pattern of RNA elements around enhancer predictions containing DNase I hypersensitive (HS) sites. The lines represent the average frequency of RNA elements (top: polyadenylated long RNA contigs; middle: CAGE tag clusters; bottom: non-polyadenylated long RNA contigs) in a genomic window around the center of the enhancer prediction as determined by DNase I HS sites. Elements on the plus strand are shown in red, and those on the minus strand in blue. b. Enhancer transcripts differ from promoter transcripts. The box plots compare the features of transcripts at predicted enhancer loci with those at predicted novel intergenic promoters and annotated promoters. H3K4me3, PolyA+ and Nucleus denote the following three ratios: H3K4me3/(H3K4me3 + H3K4me1), polyadenylated/(polyadenylated + non-polyadenylated) and nuclear/(nuclear + cytosolic). Enhancers are marked by higher levels of H3K4me1 relative to H3K4me3 than novel or annotated promoters (left). Enhancer transcripts show higher levels of non-polyadenylated (middle) and nuclear (right) RNA relative to promoters. c. Chromatin state at transcribed enhancers. Enhancer predictions with evidence of transcription (in blue; CAGE tags present at the predicted locus) show a different pattern of histone modifications and higher levels of RNA polymerase II binding than non-transcribed predictions (red). They are enriched for H3K27 acetylation, H3K4 methylation and H3K79 di-methylation, and depleted for H3K27 tri-methylation. d. Enhancer activity and transcription are cell-type specific. Loci predicted to be active transcribed enhancers in GM12878 cells show low signal for CAGE tags (top) and for H3K27 acetylation (bottom) in other cell lines.
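The three ratios named in panel b all have the form x/(x + y). A minimal Python sketch follows; the function names, the dictionary keys and the signal values are hypothetical and chosen only to show the arithmetic, not to reproduce the paper's pipeline.

```python
def fraction(x, y):
    """Return x / (x + y), or None when both signals are zero."""
    total = x + y
    return None if total == 0 else x / total


def locus_ratios(signal):
    """Compute the three panel-b ratios for one locus.

    signal: dict of hypothetical per-locus signal values (keys are illustrative).
    """
    return {
        "H3K4me3": fraction(signal["h3k4me3"], signal["h3k4me1"]),
        "PolyA+": fraction(signal["polya"], signal["nonpolya"]),
        "Nucleus": fraction(signal["nuclear"], signal["cytosolic"]),
    }


# Hypothetical enhancer-like locus: mostly H3K4me1, non-polyadenylated, nuclear.
enhancer = {"h3k4me3": 2.0, "h3k4me1": 6.0,
            "polya": 1.0, "nonpolya": 3.0,
            "nuclear": 9.0, "cytosolic": 3.0}
print(locus_ratios(enhancer))
# {'H3K4me3': 0.25, 'PolyA+': 0.25, 'Nucleus': 0.75}
```

Each ratio lies between 0 and 1, so loci with very different absolute signal levels can be compared on the same scale.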
Figure 6. Size distribution of intergenic regions
Novel genes increase the proportion of small intergenic regions; ig/as = intergenic / antisense.
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
