Nature. 2012 Sep 6;489(7414):101-8.

doi: 10.1038/nature11233.

Landscape of transcription in human cells

Sarah Djebali¹, Carrie A Davis, Angelika Merkel, Alex Dobin, Timo Lassmann, Ali Mortazavi, Andrea Tanzer, Julien Lagarde, Wei Lin, Felix Schlesinger, Chenghai Xue, Georgi K Marinov, Jainab Khatun, Brian A Williams, Chris Zaleski, Joel Rozowsky, Maik Röder, Felix Kokocinski, Rehab F Abdelhamid, Tyler Alioto, Igor Antoshechkin, Michael T Baer, Nadav S Bar, Philippe Batut, Kimberly Bell, Ian Bell, Sudipto Chakrabortty, Xian Chen, Jacqueline Chrast, Joao Curado, Thomas Derrien, Jorg Drenkow, Erica Dumais, Jacqueline Dumais, Radha Duttagupta, Emilie Falconnet, Meagan Fastuca, Kata Fejes-Toth, Pedro Ferreira, Sylvain Foissac, Melissa J Fullwood, Hui Gao, David Gonzalez, Assaf Gordon, Harsha Gunawardena, Cedric Howald, Sonali Jha, Rory Johnson, Philipp Kapranov, Brandon King, Colin Kingswood, Oscar J Luo, Eddie Park, Kimberly Persaud, Jonathan B Preall, Paolo Ribeca, Brian Risk, Daniel Robyr, Michael Sammeth, Lorian Schaffer, Lei-Hoon See, Atif Shahab, Jorgen Skancke, Ana Maria Suzuki, Hazuki Takahashi, Hagen Tilgner, Diane Trout, Nathalie Walters, Huaien Wang, John Wrobel, Yanbao Yu, Xiaoan Ruan, Yoshihide Hayashizaki, Jennifer Harrow, Mark Gerstein, Tim Hubbard, Alexandre Reymond, Stylianos E Antonarakis, Gregory Hannon, Morgan C Giddings, Yijun Ruan, Barbara Wold, Piero Carninci, Roderic Guigó, Thomas R Gingeras

PMID: 22955620
PMCID: PMC3684276
DOI: 10.1038/nature11233

Sarah Djebali et al. Nature. 2012.

Affiliation

¹ Centre for Genomic Regulation and UPF, Doctor Aiguader 88, Barcelona 08003, Catalonia, Spain.

Abstract

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

Figures

**Figure 1. A large majority of Gencode elements are detected by RNA-seq data**
Shown are Gencode detected elements in the polyadenylated and non-polyadenylated fractions of cellular compartments (cumulative counts for both RNA fractions and compartments refer to elements present in any of the fractions or compartments). Each box plot is generated from values across all cell lines, thus capturing the dispersion across cell lines. The largest point shows the cumulative value over all cell lines.

**Figure 2. Co-transcriptional splicing**
a. Short read mappings that allow assessment of splicing completion around exons. (*a,b,c*) Reads providing evidence that splicing is complete for the region containing the exon (with either exon inclusion, *a* and *b*, or exclusion, *c*). (*d,e*) Reads providing evidence that splicing of the region containing the exon is not yet complete. The completed Splicing Index (*coSI*) is the ratio of *a+b+c* over *a+b+c+d+e* and can thus be broadly assumed to correspond to the fraction of RNA molecules in which the region containing the exon has already been spliced (see Tilgner et al.). A *coSI* value of 1 means splicing is completed, while a value of 0 indicates that splicing has not yet been initiated. b. Distribution of *coSI* scores computed on Gencode internal exons: (Top) Distribution in the total chromatin RNA fraction. (Bottom) Distribution in the cytosolic polyadenylated RNA fraction.
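
The coSI ratio described in this caption can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated above, not the authors' pipeline: the function name `cosi` and the choice to return `None` when no informative reads are present are assumptions made for the example.

```python
from typing import Optional


def cosi(a: int, b: int, c: int, d: int, e: int) -> Optional[float]:
    """Ratio (a+b+c) / (a+b+c+d+e) for one exon.

    a, b, c: counts of reads supporting completed splicing
             (a and b: exon inclusion, c: exon exclusion).
    d, e:    counts of reads supporting splicing not yet completed.
    Returns None when there are no informative reads (an assumption
    of this sketch; the caption does not say how such exons are handled).
    """
    spliced = a + b + c
    total = spliced + d + e
    if total == 0:
        return None
    return spliced / total


# Hypothetical read counts, for illustration only.
print(cosi(3, 3, 2, 1, 1))
```

A value near 1 corresponds to an exon whose surrounding region is already spliced in most molecules, and a value near 0 to one where splicing has mostly not started.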

**Figure 3. Abundance of gene types in cellular compartments**
2D Kernel density plots of nuclear over cytosolic enrichment (Y axis) versus overall gene expression in the whole cell extract (X axis), for protein coding, long non-coding and novel genes over all cell lines. Only genes present in all 3 RNA extracts are displayed, as well as two representative genes (*ACTG1* in red and *H19* in blue), for which the expression in each individual cell line is shown. The actual values of the estimated Kernel density are indicated by contour lines and color shades.

**Figure 4. Isoform expression within a gene**
a. Number of expressed isoforms per gene per cell line. Genes tend to express many isoforms simultaneously. b. Relative expression of the most abundant isoform per gene per cell line. There is generally one dominant isoform in a given condition.

**Figure 5. Transcription at enhancers**
a. The pattern of RNA elements around enhancer predictions containing DNase I hypersensitive (HS) sites. The lines represent the average frequency of RNA elements (top: polyadenylated long RNA contigs; middle: CAGE tag clusters; bottom: non-polyadenylated long RNA contigs) in a genomic window around the center of the enhancer prediction as determined by DNase I HS sites. Elements on the plus strand are shown in red, and on the minus strand in blue. b. Enhancer transcripts differ from promoter transcripts. The box plots compare the features of transcripts at predicted enhancer loci with those at predicted novel intergenic promoters and annotated promoters. H3K4me3, PolyA+ and Nucleus denote the following three ratios: H3K4me3/(H3K4me3 + H3K4me1), polyadenylated/(polyadenylated + non-polyadenylated), Nuclear/(Nuclear + Cytosolic). Enhancers are marked by higher levels of H3K4me1 relative to H3K4me3 than novel or annotated promoters (left). Enhancer transcripts show higher levels of non-polyadenylated (middle) and nuclear (right) RNA relative to promoters. c. Chromatin state at transcribed enhancers. Enhancer predictions with evidence of transcription (in blue; CAGE tags present at predicted locus) show a different pattern of histone modifications and higher levels of RNA Polymerase II binding than non-transcribed predictions (red). They are enriched for H3K27 acetylation, H3K4 methylation and H3K79 di-methylation, and depleted for H3K27 tri-methylation. d. Enhancer activity and transcription are cell-type specific. Loci predicted to be active transcribed enhancers in GM12878 cells show low signal for CAGE tags (top) and for H3K27 acetylation (bottom) in other cell lines.
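
The three ratios named in panel b all have the form x / (x + y), so each lies between 0 and 1. The sketch below shows that arithmetic only. The function names, argument names and example signal values are hypothetical, and how the underlying signals were quantified is not stated in this caption.

```python
from typing import Dict, Optional


def fraction(x: float, y: float) -> Optional[float]:
    """Return x / (x + y), or None if both signals are zero."""
    total = x + y
    return None if total == 0 else x / total


def locus_ratios(h3k4me3: float, h3k4me1: float,
                 polya_plus: float, polya_minus: float,
                 nuclear: float, cytosolic: float) -> Dict[str, Optional[float]]:
    """Compute the three ratios from panel b for a single locus.

    Inputs are assumed to be non-negative signal levels (hypothetical units).
    """
    return {
        "H3K4me3": fraction(h3k4me3, h3k4me1),
        "PolyA+": fraction(polya_plus, polya_minus),
        "Nucleus": fraction(nuclear, cytosolic),
    }


# Hypothetical values for an enhancer-like locus: more H3K4me1 than H3K4me3,
# mostly non-polyadenylated and mostly nuclear RNA.
print(locus_ratios(h3k4me3=1, h3k4me1=3,
                   polya_plus=2, polya_minus=6,
                   nuclear=9, cytosolic=3))
```

Under this definition, the enhancer pattern described in the caption corresponds to low "H3K4me3" and "PolyA+" values and a high "Nucleus" value.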

**Figure 6. Size distribution of intergenic regions**
Novel genes increase the proportion of small intergenic regions; ig/as = intergenic / antisense.

Comment in

Genomics: users' guide to the human genome.
Skipper M. Nat Rev Genet. 2012 Oct;13(10):678. doi: 10.1038/nrg3329. Epub 2012 Sep 7. PMID: 22955793.
Discovering the complexity of the metazoan transcriptome.
Hangauer MJ, Carpenter S, McManus MT. Genome Biol. 2014 Apr 29;15(4):112. doi: 10.1186/gb4172. PMID: 25001157.
