Genome Biol. 2014;15(10):491.
doi: 10.1186/s13059-014-0491-2.

Genome-wide profiling of mouse RNA secondary structures reveals key features of the mammalian transcriptome

Danny Incarnato et al. Genome Biol. 2014.

Abstract

Background: Understanding RNA structure is key to understanding RNA functions and mechanisms of action. In particular, non-coding RNAs are thought to exert their functions through specific secondary structures, but an efficient large-scale annotation of these structures is still missing.

Results: Using a novel high-throughput method named chemical inference of RNA structures (CIRS-seq), which uses dimethyl sulfate and N-cyclohexyl-N'-(2-morpholinoethyl) carbodiimide metho-p-toluenesulfonate to modify RNA residues in single-stranded conformation within native deproteinized RNA secondary structures, we investigate the structural features of mouse embryonic stem cell transcripts. Our analysis reveals an unexpectedly higher structuring of the 5′ and 3′ untranslated regions compared to the coding regions, a reduced structuring at the Kozak sequence and stop codon, and a three-nucleotide periodicity across the coding region of messenger RNAs. We also observe that ncRNAs exhibit a higher degree of structuring than protein-coding transcripts. Moreover, we find that the Lin28a binding protein binds selectively to RNA motifs with a strong preference toward a single-stranded conformation.

Conclusions: This work defines for the first time the complete RNA structurome of mouse embryonic stem cells, revealing an extremely distinct RNA structural landscape. These results demonstrate that CIRS-seq constitutes an important tool for the identification of native deproteinized RNA structures.


Figures

Figure 1
Overview of the CIRS-seq method. Cells are harvested and lysed in isotonic buffer, then treated with Proteinase K to unmask protein-bound regions of RNAs. The whole cell population of RNAs in their native deproteinized conformation is probed with either DMS or CMCT to modify unpaired bases. A non-treated control is also produced to allow further mapping of natural RT stops. After modification, the RNAs from the three populations are reverse transcribed, and cDNA is adapter ligated for high-throughput sequencing. Mapping reads to the transcriptome provides information regarding how many RT stops occurred at each position of the analyzed transcripts. The non-treated (NT) signal at each position is then subtracted from the DMS and CMCT signals to obtain the raw reactivity profile at base resolution. After scaling each data point above the 90th percentile to the 90th percentile, reactivity at each position is divided by the 90th percentile (90% Winsorising) to obtain the normalized reactivity.
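The normalization described in this caption (background subtraction followed by 90% Winsorising) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `normalize_reactivity` is hypothetical, the inputs are assumed to be per-position RT-stop counts for one transcript, and clipping negative differences to zero is an assumption the caption does not state.

```python
import numpy as np


def normalize_reactivity(treated_stops, control_stops, percentile=90.0):
    """Hypothetical helper: NT subtraction followed by 90% Winsorising."""
    treated = np.asarray(treated_stops, dtype=float)
    control = np.asarray(control_stops, dtype=float)

    # Raw reactivity: treated (DMS or CMCT) signal minus the non-treated signal.
    # Clipping negative values to zero is an assumption made for this sketch.
    raw = np.clip(treated - control, 0.0, None)

    # Value of the 90th percentile of the raw profile.
    cap = np.percentile(raw, percentile)
    if cap <= 0:
        # No signal above background: return an all-zero profile.
        return np.zeros_like(raw)

    # Cap values above the percentile, then divide by it, giving values in [0, 1].
    return np.minimum(raw, cap) / cap


if __name__ == "__main__":
    treated = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    control = [0] * len(treated)
    print(normalize_reactivity(treated, control))
```

With this scheme every position at or above the 90th percentile receives a normalized reactivity of 1, so a few extreme RT-stop counts cannot dominate the profile.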
Figure 2
Validation of CIRS-seq data. (a) Distribution of transcripts with at least one RT stop on average per base. (b) Scatter plot of normalized reactivities in the two biological replicates of CIRS-seq. Reactivities are averaged in 10-nucleotide windows, with an offset of 5 nucleotides (Pearson’s correlation coefficient = 0.90). (c) Normalized reactivity profiles for the glutamic acid tRNA and overlay of reactivity data on the phylogenetically derived secondary structure. Yellow arrows indicate highly reactive positions (reactivity >0.7). Bases are color coded according to their reactivity. (d) Normalized reactivity profiles for the U5 snRNA and overlay of reactivity data on the phylogenetically derived secondary structure. The structure of the U5 human homolog is also shown, with superimposed DMS/CMCT-reactive positions from [55]. The colors correspond to different degrees of chemical modification (purple, strong; yellow, medium; green, weak). Yellow arrows indicate highly reactive positions (reactivity >0.7). Bases are color coded according to their reactivity.
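The replicate comparison in panel (b) can be illustrated with a short Python sketch. It assumes that "an offset of 5 nucleotides" means consecutive 10-nucleotide windows start 5 nucleotides apart, which the caption does not spell out; the function names `windowed_means` and `replicate_correlation` are hypothetical, and both replicates are assumed to be equal-length normalized reactivity profiles.

```python
import numpy as np


def windowed_means(reactivity, window=10, step=5):
    """Average a profile in sliding windows (assumed: windows start every `step` nt)."""
    values = np.asarray(reactivity, dtype=float)
    # Only full-length windows are kept; a trailing partial window is dropped.
    starts = range(0, len(values) - window + 1, step)
    return np.array([values[s:s + window].mean() for s in starts])


def replicate_correlation(rep1, rep2, window=10, step=5):
    """Pearson correlation between the window-averaged profiles of two replicates."""
    a = windowed_means(rep1, window, step)
    b = windowed_means(rep2, window, step)
    return float(np.corrcoef(a, b)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rep1 = rng.random(200)
    rep2 = rep1 + rng.normal(scale=0.1, size=200)  # synthetic noisy replicate
    print(replicate_correlation(rep1, rep2))
```

Averaging in overlapping windows smooths single-nucleotide noise before the correlation is computed, which is one plausible reason for comparing replicates at window rather than base resolution.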
Figure 3
CIRS-seq data allow correct inference of native deproteinized RNA secondary structures. (a) Normalized reactivity profiles for the U2 snRNA and overlay of reactivity data on the secondary structure inferred from chemical constraints. Bases are color coded according to their reactivity. The structure of the human ortholog with superimposed SHAPE-reactive positions from [57] and the unconstrained MFE structure are also shown. (b) Normalized reactivity profiles for the low-abundance U12 snRNA and overlay of reactivity data on the secondary structure inferred from chemical constraints. Bases are color coded according to their reactivity. The structure of the U12 A. thaliana ortholog with superimposed DMS/SHAPE-reactive positions from [58] and the unconstrained MFE structure are also shown.
Figure 4
Transcriptome-wide analysis of mRNAs reveals structural features of protein-coding and non-coding transcripts. (a) Meta-gene analysis across the last 50 nucleotides of the 5′ UTR, the first and last 100 nucleotides of the coding region, and the first 50 nucleotides of the 3′ UTR of approximately 9,500 mRNAs. (b) Average reactivity of the 5′ UTR, coding region, and 3′ UTR. (c) Average reactivity on the Kozak sequence (−6/+1 nucleotides around AUG), coding region, and stop codon (+3 nucleotides upstream). (d) Average reactivity for the first, second, and third base of each coding sequence codon, and for the first, second, and third base of the 5′ UTR and 3′ UTR, respectively, in the first and last 99 nucleotides of the coding region, last 48 nucleotides of the 5′ UTR, and first 48 nucleotides of the 3′ UTR. (e) Box-plot of base-normalized average CIRS-seq reactivities for protein-coding and non-coding RNAs, calculated on all transcript positions with sequencing depth >50×.
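The per-codon-position averaging behind panel (d) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `codon_position_means` is hypothetical, the input is assumed to be the normalized reactivity of a region that starts in frame at index 0, and positions without data are not handled.

```python
import numpy as np


def codon_position_means(region_reactivity):
    """Mean reactivity at the first, second, and third position of each triplet."""
    values = np.asarray(region_reactivity, dtype=float)
    # Drop trailing positions that do not complete a full triplet.
    usable = len(values) - len(values) % 3
    triplets = values[:usable].reshape(-1, 3)
    # Column k holds every k-th triplet position; average down the columns.
    return triplets.mean(axis=0)


if __name__ == "__main__":
    # Synthetic profile in which the third position is more reactive than the others.
    profile = np.tile([0.2, 0.3, 0.6], 33)
    print(codon_position_means(profile))
```

A three-nucleotide periodicity would appear as a consistent difference among the three averages within the coding region, while the same calculation on UTR windows serves as a comparison.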
Figure 5
CIRS-seq reveals structural preferences of RNA binding proteins. (a) Average reactivity across 300 nucleotides surrounding summits of Lin28a peaks. (b) Representation of sample secondary structures for Lin28a binding sites. Bases are color coded according to their reactivity. The purine-rich motifs are highlighted in green.

References

    1. ENCODE Project Consortium. Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
    2. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, et al. Landscape of transcription in human cells. Nature. 2012;489:101–108. doi: 10.1038/nature11233.
    3. Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigó R. The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res. 2012;22:1775–1789. doi: 10.1101/gr.132159.111.
    4. Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I, Meissner A, Regev A, Rinn JL, Root DE, Lander ES. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature. 2011;477:295–300. doi: 10.1038/nature10398.
    5. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25:1915–1927. doi: 10.1101/gad.17446611.
