Genome Biol. 2014;15(10):491.

doi: 10.1186/s13059-014-0491-2.

Genome-wide profiling of mouse RNA secondary structures reveals key features of the mammalian transcriptome

Danny Incarnato¹, Francesco Neri, Francesca Anselmi, Salvatore Oliviero

Affiliation

¹ Human Genetics Foundation (HuGeF), via Nizza 52, Torino 10126, Italy.

PMID: 25323333
PMCID: PMC4220049
DOI: 10.1186/s13059-014-0491-2

Abstract

Background: Understanding RNA structure is key to understanding RNA functions and mechanisms of action. In particular, non-coding RNAs are thought to exert their functions through specific secondary structures, but an efficient large-scale annotation of these structures is still missing.

Results: Using a novel high-throughput method named chemical inference of RNA structures (CIRS-seq), which uses dimethyl sulfate (DMS) and N-cyclohexyl-N'-(2-morpholinoethyl) carbodiimide metho-p-toluenesulfonate (CMCT) to modify RNA residues in single-stranded conformation within native deproteinized RNA secondary structures, we investigate the structural features of mouse embryonic stem cell transcripts. Our analysis reveals an unexpectedly higher degree of structuring in the 5′ and 3′ untranslated regions than in the coding regions, reduced structuring at the Kozak sequence and stop codon, and a three-nucleotide periodicity across the coding region of messenger RNAs. We also observe that non-coding RNAs (ncRNAs) exhibit a higher degree of structuring than protein-coding transcripts. Moreover, we find that the RNA-binding protein Lin28a binds selectively to RNA motifs with a strong preference for a single-stranded conformation.

Conclusions: This work defines for the first time the complete RNA structurome of mouse embryonic stem cells, revealing an extremely distinct RNA structural landscape. These results demonstrate that CIRS-seq constitutes an important tool for the identification of native deproteinized RNA structures.

Figures

**Figure 1**
**Overview of the CIRS-seq method.** Cells are harvested and lysed in isotonic buffer, then treated with Proteinase K to unmask protein-bound regions of RNAs. The whole cellular RNA population, in its native deproteinized conformation, is probed with either DMS or CMCT to modify unpaired bases. A non-treated control is also produced to allow mapping of natural reverse transcription (RT) stops. After modification, the RNAs from the three populations are reverse transcribed, and the cDNA is adapter ligated for high-throughput sequencing. Mapping reads to the transcriptome provides the number of RT stops that occurred at each position of the analyzed transcripts. The non-treated (NT) signal at each position is then subtracted from the DMS and CMCT signals to obtain the raw reactivity profile at base resolution. Each data point above the 90th percentile is set to the 90th percentile, and the reactivity at each position is then divided by the 90th percentile (90% Winsorising) to obtain the normalized reactivity.
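The normalization described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names `percentile` and `normalize_reactivity` are hypothetical, the percentile uses linear interpolation between ranks, and negative differences after NT subtraction are clamped to zero, none of which is stated in the caption.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    The interpolation rule is an assumption; the caption does not say
    which percentile definition was used.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a percentile of an empty list")
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def normalize_reactivity(treated, untreated, q=90.0):
    """Subtract the NT signal, then apply 90% Winsorising.

    treated:   RT-stop counts per position in the DMS or CMCT sample
    untreated: RT-stop counts per position in the non-treated control
    Negative differences are clamped to 0 (an assumption).
    """
    if len(treated) != len(untreated):
        raise ValueError("profiles must have the same length")
    # Raw reactivity: treated signal minus background RT stops.
    raw = [max(t - n, 0.0) for t, n in zip(treated, untreated)]
    cap = percentile(raw, q)
    if cap <= 0:
        # No signal above background; avoid dividing by zero.
        return [0.0] * len(raw)
    # Cap values at the percentile, then scale so the cap maps to 1.0.
    return [min(r, cap) / cap for r in raw]
```

With this scheme every normalized value lies between 0 and 1, and a single extreme position (for example a strong RT stop) cannot compress the rest of the profile.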

**Figure 2**
**Validation of CIRS-seq data. (a)** Distribution of transcripts with at least one RT stop on average per base. **(b)** Scatter plot of normalized reactivities in the two biological replicates of CIRS-seq. Reactivities are averaged in 10-nucleotide windows, with an offset of 5 nucleotides (Pearson’s correlation coefficient = 0.90). **(c)** Normalized reactivity profiles for the glutamic acid tRNA and overlay of reactivity data on the phylogenetically derived secondary structure. Yellow arrows indicate highly reactive positions (reactivity >0.7). Bases are color coded according to their reactivity. **(d)** Normalized reactivity profiles for the U5 snRNA and overlay of reactivity data on the phylogenetically derived secondary structure. The structure of the human U5 homolog is also shown, with superimposed DMS/CMCT-reactive positions from [55]. The colors correspond to different degrees of chemical modification (purple, strong; yellow, medium; green, weak). Yellow arrows indicate highly reactive positions (reactivity >0.7). Bases are color coded according to their reactivity.
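The replicate comparison in panel (b) can be illustrated with a short Python sketch. It assumes that "offset of 5 nucleotides" means the 10-nucleotide window slides in steps of 5, and that incomplete trailing windows are dropped; both are interpretations rather than details given in the caption. The helper names `window_means` and `pearson` are hypothetical.

```python
import math


def window_means(values, size=10, step=5):
    """Average values in windows of `size` positions, moving by `step`.

    Trailing positions that do not fill a whole window are ignored
    (an assumption).
    """
    return [
        sum(values[i:i + size]) / size
        for i in range(0, len(values) - size + 1, step)
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation(rep1, rep2, size=10, step=5):
    """Correlate two reactivity profiles after window averaging."""
    return pearson(window_means(rep1, size, step),
                   window_means(rep2, size, step))
```

Averaging over windows before correlating smooths single-base noise, so the resulting coefficient reflects agreement at the level of local structure rather than of individual nucleotides.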

**Figure 3**
**CIRS-seq data allow correct inference of native deproteinized RNA secondary structures. (a)** Normalized reactivity profiles for the U2 snRNA and overlay of reactivity data on the secondary structure inferred from chemical constraints. Bases are color coded according to their reactivity. The structure of the human ortholog with superimposed SHAPE-reactive positions from [57] and the unconstrained minimum free energy (MFE) structure are also shown. **(b)** Normalized reactivity profiles for the low-abundance U12 snRNA and overlay of reactivity data on the secondary structure inferred from chemical constraints. Bases are color coded according to their reactivity. The structure of the U12 *A. thaliana* ortholog with superimposed DMS/SHAPE-reactive positions from [58] and the unconstrained MFE structure are also shown.

**Figure 4**
**Transcriptome-wide analysis of mRNAs reveals structural features of protein-coding and non-coding transcripts. (a)** Meta-gene analysis across the last 50 nucleotides of the 5′ UTR, the first and last 100 nucleotides of the coding region, and the first 50 nucleotides of the 3′ UTR of approximately 9,500 mRNAs. **(b)** Average reactivity of the 5′ UTR, coding region, and 3′ UTR. **(c)** Average reactivity on the Kozak sequence (−6/+1 nucleotides around AUG), coding region, and stop codon (+3 nucleotides upstream). **(d)** Average reactivity for the first, second, and third base of each coding sequence codon, and for the first, second, and third base of the 5′ UTR and 3′ UTR, respectively, in the first and last 99 nucleotides of the coding region, last 48 nucleotides of the 5′ UTR, and first 48 nucleotides of the 3′ UTR. **(e)** Box-plot of base-normalized average CIRS-seq reactivities for protein-coding and non-coding RNAs, calculated on all transcript positions with sequencing depth >50×.
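The per-codon-position averaging behind panel (d) can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a list of normalized reactivities that starts at the first base of a codon, any trailing bases that do not complete a codon are ignored, and the function name `codon_position_means` is hypothetical.

```python
def codon_position_means(reactivities):
    """Average reactivity at codon positions 1, 2 and 3.

    The input is assumed to be in frame, i.e. index 0 is the first
    base of a codon. Bases beyond the last complete codon are skipped.
    """
    usable = len(reactivities) - len(reactivities) % 3
    if usable == 0:
        raise ValueError("need at least one complete codon")
    sums = [0.0, 0.0, 0.0]
    for i in range(usable):
        # i % 3 is 0, 1 or 2 for the first, second or third codon base.
        sums[i % 3] += reactivities[i]
    codons = usable // 3
    return [s / codons for s in sums]
```

A three-nucleotide periodicity appears as a consistent difference among the three means in a coding region. Applying the same function to a 48-nucleotide UTR segment serves as a comparison in which no codon frame exists.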

References

1. ENCODE Project Consortium. Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
2. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, et al. Landscape of transcription in human cells. Nature. 2012;489:101–108. doi: 10.1038/nature11233.
3. Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigó R. The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res. 2012;22:1775–1789. doi: 10.1101/gr.132159.111.
4. Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I, Meissner A, Regev A, Rinn JL, Root DE, Lander ES. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature. 2011;477:295–300. doi: 10.1038/nature10398.
5. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25:1915–1927. doi: 10.1101/gad.17446611.
