Cell. 2016 May 19;165(5):1267-1279.
doi: 10.1016/j.cell.2016.04.028. Epub 2016 May 12.

RNA Duplex Map in Living Cells Reveals Higher-Order Transcriptome Structure

Zhipeng Lu et al. Cell.

Abstract

RNA has the intrinsic property to base pair, forming complex structures fundamental to its diverse functions. Here, we develop PARIS, a method based on reversible psoralen crosslinking for global mapping of RNA duplexes with near base-pair resolution in living cells. PARIS analysis in three human and mouse cell types reveals frequent long-range structures, higher-order architectures, and RNA-RNA interactions in trans across the transcriptome. PARIS determines base-pairing interactions on an individual-molecule level, revealing pervasive alternative conformations. We used PARIS-determined helices to guide phylogenetic analysis of RNA structures and discovered conserved long-range and alternative structures. XIST, a long noncoding RNA (lncRNA) essential for X chromosome inactivation, folds into evolutionarily conserved RNA structural domains that span many kilobases. XIST A-repeat forms complex inter-repeat duplexes that nucleate higher-order assembly of the key epigenetic silencing protein SPEN. PARIS is a generally applicable and versatile method that provides novel insights into the RNA structurome and interactome.

Figures

Figure 1. PARIS identifies RNA helices and interactions in living cells
(A) Schematic diagram of PARIS with three critical steps: in vivo AMT crosslinking, 2D gel purification and proximity library. The blue line is AMT. The dashed lines indicate ligations. Note that ligation can occur at either end, resulting in normal gapped or chiastic reads. (B) 2D purification of the crosslinked RNA. The blue box indicates the region that contains crosslinked RNA. The percentage of crosslinked RNA recovered from total RNA is indicated in parentheses. See Figure S1 for the high RNase digestion 2D gel. (C) PARIS sequenced reads are highly reproducible between the high RNase and low RNase conditions in HeLa cells. (D) Comparison of known structures (black arcs) and interactions (blue arcs) of the U4 and U6 snRNAs to PARIS DGs. Ten reads are shown for each DG. The dashed box highlights DG2 (see E-G). DG2 and DG4: U4 stem-loops. DG5 and DG6: U6 stem-loops. DG1 and DG3: U4:U6 interaction. (E) An example duplex group (DG2) in U4 snRNA and the definition of terms (DG, arm, gap/loop and span) used in this paper. Note that the staggered termini of the two arms indicate that these reads come from distinct RNase cleavage sites on individual RNA molecules, i.e., each gapped read is an individual-molecule measurement of a stem-loop or an RNA-RNA interaction duplex. (F-G) The structure model of the duplex group (DG2) is consistent with known base pairs from the crystal structure of U4. Dashes are the gaps. (H-I) PARIS identifies the stem-loop structure in the low-abundance snRNA U7 (H) and the MIR10A precursor (I). (J) PARIS identifies known structures in telomerase RNA (TERC). The boxes indicate interlocking DGs corresponding to the P2/P3 pseudoknot. See also Figures S1 and S2 and Table S1.
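The terms defined in panel E can be made concrete with a small sketch. The Python below is illustrative only: the class name GappedRead, its field names, and the 0-based half-open coordinate convention are assumptions made here, not part of the PARIS pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GappedRead:
    """One gapped read: two arms separated by a gap (hypothetical layout).

    Coordinates are 0-based, half-open intervals on a single transcript.
    """
    left_start: int
    left_end: int
    right_start: int
    right_end: int

    @property
    def gap(self) -> int:
        # Stretch between the two arms (the loop of a stem-loop).
        return self.right_start - self.left_end

    @property
    def span(self) -> int:
        # Distance from the outer end of one arm to the outer end of the other.
        return self.right_end - self.left_start


read = GappedRead(left_start=10, left_end=30, right_start=55, right_end=75)
print(read.gap, read.span)
```

Under these assumptions, a duplex group would simply be a collection of such reads whose arms overlap on both sides.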
Figure 2. Global properties of RNA structures in living cells
(A) Size distribution of RNA structures. One replicate from each cell type is shown here. Genomic span is the distance between the ends of gapped reads in the genome, while the transcriptomic span excludes introns. (B) Metagene distribution of PARIS-determined helices among exons. Only the first three and last three exons are plotted. One biological replicate is plotted for each cell type. The gradation of green color correlates with the number of DGs on a log scale. (C,D) Example higher-order architecture of human RPS4X mRNA (C). The blue boxed region is enlarged to highlight DGs connecting different parts of the mRNA (D). See also Figures S3 and S4.
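The distinction in panel A between genomic and transcriptomic span can be sketched as follows. This is a minimal illustration, assuming half-open genomic intervals and non-overlapping introns; the function name and the example coordinates are hypothetical.

```python
def transcriptomic_span(start, end, introns):
    """Span between two genomic positions after removing intronic bases.

    start, end: genomic coordinates of the read ends, half-open [start, end).
    introns: list of (intron_start, intron_end) half-open, non-overlapping.
    """
    genomic_span = end - start
    intronic_bases = 0
    for intron_start, intron_end in introns:
        # Count only the part of each intron that lies inside the span.
        overlap = min(end, intron_end) - max(start, intron_start)
        if overlap > 0:
            intronic_bases += overlap
    return genomic_span - intronic_bases


# Hypothetical gapped read whose ends sit in different exons.
introns = [(1500, 3000), (4000, 4500)]
print(5000 - 1000, transcriptomic_span(1000, 5000, introns))
```

For a read confined to a single exon the two spans coincide, which is why the two measures differ only for structures that cross exon junctions.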
Figure 3. PARIS guides global phylogenetic analysis of RNA structures
(A) Two approaches of PARIS-guided phylogenetic analysis of RNA structures. The numbers of structures are in parentheses. (B) Scatterplot of z-scores and covariation energies for the structure-based analysis of conservation in amniotes. All 16606 structures with Z-score < −2.326 (p<0.01) are plotted. (C) Distribution of the linear span of the conserved structures identified by the two methods. (D) Evolutionarily conserved structures in RPL8 mRNA using direct comparison of human (HEK293) and mouse (ES cells) PARIS data. An example conserved long-range structure connecting the third and sixth exons in human and mouse is supported by both icSHAPE and phylogenetic analysis in multiz100 multiple genome alignments. Significance of the overlap was tested by random shuffling of DGs in the exons. In this structure, 6.5% of all potential base pairs are one- or two-sided covariants. (E) Four more examples of conserved mRNA architectures between human and mouse. See also Tables S2, S3 and S4.
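The cutoff quoted in panel B pairs a Z-score of −2.326 with p<0.01. A quick check, assuming the p-value is the lower tail of a standard normal distribution (an assumption made here; the legend does not state the test explicitly), shows the two numbers agree:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Lower-tail probability at the Z-score cutoff used in the legend.
p_at_cutoff = standard_normal.cdf(-2.326)

# Z-score whose lower-tail probability is exactly 0.01.
z_for_p01 = standard_normal.inv_cdf(0.01)

print(round(p_at_cutoff, 4), round(z_for_p01, 3))
```

Both values come out at approximately 0.01 and −2.326, so the threshold corresponds to a one-sided 1% tail.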
Figure 4. PARIS reveals pervasive alternative structures
(A) Diagram of alternative structures. (B-C) PARIS identifies alternative structures/interactions in the U4:U6 snRNA heterodimer (B). Two alternative structures are shown here: DG1 vs. DG2 and DG1 vs. DG3 (C). (D) An example of extensive alternative structures in the 3’UTR of TUBB mRNA from HeLa PARIS data. Only DGs involved in this cluster of alternative structures are shown. First track: PARIS-based structure models. The corresponding structure models and DGs are color-coded. (E) The hub of the alternative structures. The five alternative structures are displayed in dot-bracket format and color-matched to panel D. Nucleotides involved in conflicts are highlighted and underlined. (F) DGs involved in one, two, or at least three pairs of alternative structures, plotted as a fraction of all DGs. The top 50 mRNAs were used for each of the three panels. One replicate was plotted for each cell type. HeLa_LowRNase: 744 out of 3801 DGs (20%) are involved in alternative structures (711 pairs) and 31 pairs of alternative structures (4.4%) are supported by conservation/covariation (both structures in each pair). HEK293_1: 459 out of 1338 DGs (34%) are involved in alternative structures (448 pairs) and 7 pairs of alternative structures (1.6%) are supported by conservation/covariation. mES_1: 592 out of 1291 DGs (46%) are involved in alternative structures. (G) An example alternative structure in RPL8 mRNA supported by both human and mouse icSHAPE and PARIS data. The structure models show the perfect correspondence between the icSHAPE data and base pairs (gray shaded area). See also Table S5 and Figures S5 and S7.
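The notion of conflicting nucleotides in panel E can be illustrated with two dot-bracket strings. The sketch below assumes a simplified definition: a nucleotide is in conflict when it is base-paired in both structures but with different partners. The function names and the toy structures are hypothetical and are not taken from the paper.

```python
def pair_table(dot_bracket):
    """Map each paired position to its partner for a dot-bracket string."""
    stack, partner = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()  # raises IndexError if brackets are unbalanced
            partner[i] = j
            partner[j] = i
    return partner


def conflicting_positions(structure_a, structure_b):
    """Positions paired in both structures but to different partners."""
    pairs_a, pairs_b = pair_table(structure_a), pair_table(structure_b)
    return sorted(i for i in pairs_a if i in pairs_b and pairs_a[i] != pairs_b[i])


# Toy example: positions 8-11 pair leftward in one model, rightward in the other.
model_1 = "((((....))))........"
model_2 = "........((((....))))"
print(conflicting_positions(model_1, model_2))
```

In this toy case the shared arm (positions 8 to 11) is the hub: it cannot form both helices on the same molecule at the same time.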
Figure 5. PARIS determines new RNA:RNA interactions with high resolution
(A) Models of H/ACA box sno/scaRNA-guided RNA pseudouridylation and C/D box sno/scaRNA-guided 2’-O-methylation. Ψ: pseudouridine. 2’-O-Me: 2’-O-methyl. (B) Specificity and resolution of the snoRNA-guided modification of human ribosomal RNAs. For each known snoRNA:rRNA interaction, the number of reads was normalized so that the maximum is 1. All identified snoRNA:rRNA interactions from HEK293 cells were averaged. (C-D) Base-pairing models from snoRNABase (C) and PARIS data (D) are shown for the SNORD95:28S interaction. The asterisk indicates the known modification site. (E-F) PARIS in human and mouse cells reveals the interaction site on U8 snoRNA (E) and 28S rRNA (F). PARIS-determined interaction sites are marked by the blue box, while the previously reported binding site is shaded gray (Peculis 1997). (G-H) The original U8:rRNA interaction is not supported by phylogenetic conservation or hybridization energy (G), whereas the newly identified U8:rRNA interaction is (H). The consensus sequences are from Rfam. (I) Meta-analysis of the U1 target site. The U1:MALAT1 interactions use the 5’ end of the U1 snRNA in both human and mouse cells. (J) U1 snRNA interacts with MALAT1 RNA in human and mouse cells. PARIS achieves higher resolution than RAP (Engreitz 2014). The blue shaded peaks are shared between human/mouse PARIS and RAP data. The red shaded peaks are shared between one of the PARIS datasets and RAP data. Fisher's exact test was used to show the significant overlap between human and mouse PARIS-determined U1 sites. (K) Example gapped reads for a conserved U1:MALAT1 interaction. The 5’ end of the U1 snRNA interacts with MALAT1 (at nt position ~5400). See also Figure S6.
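The normalization described for panel B (scale each interaction so its maximum is 1, then average across interactions) can be sketched as below. This is an illustration only: it assumes each interaction is represented by a read-count profile over the same window of positions around the modification site, and the function names and counts are made up.

```python
def normalize_to_max(profile):
    """Scale a read-count profile so that its largest value is 1."""
    peak = max(profile)
    if peak == 0:
        return [0.0] * len(profile)
    return [count / peak for count in profile]


def average_normalized(profiles):
    """Average equally long profiles after per-profile max-normalization."""
    normalized = [normalize_to_max(p) for p in profiles]
    return [sum(column) / len(normalized) for column in zip(*normalized)]


# Two hypothetical interactions with very different sequencing depth.
profiles = [
    [0, 2, 4, 2, 0],
    [1, 5, 10, 5, 1],
]
print(average_normalized(profiles))
```

Normalizing before averaging keeps a highly expressed snoRNA from dominating the meta-profile, so the average reflects the shape of the signal rather than its depth.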
Figure 6. Integrated structure analysis of the human XIST RNA
(A) Overview of XIST lncRNA. XIST exons and repeats, phylogenetic conservation (PhyloP), icSHAPE, and PARIS data in HEK293 cells are shown. (B) Architecture of the XIST RNA. Each point in the triangular heatmap shows the PARIS connection between the two regions indicated by the feet of the triangle. Data are plotted in 100nt × 100nt bins. Each RNA duplex detected by PARIS is plotted below. The duplex loops are clustered into four major RNA structure domains. The repeat A region is a small domain before domain 1. (C) Conservation of RNA duplexes determined by phylogenetic analysis of eutherian XIST homologs. Conserved helices (p-value<0.01) are plotted. (D) An example long-range (~7kb gap) structure with PARIS, icSHAPE, and phylogenetic support (9.4% of all base pairs are one- or two-sided covariants). (E) Integrated structure analysis of the conserved repeats in the A-repeat region. Conservation track: phyloP score for the eutherian alignments. PARIS coverage is shown on a log scale. All detected inter-repeat and repeat-spacer DGs are illustrated as arcs in the structure models. A1-A8, repeats. Numbers in parentheses are the numbers of reads in each DG. The non-conserved repeat-spacer DGs (lower part) are shown separately from the conserved ones (upper part). (F) Consensus model of the A-repeat inter-repeat structure. The consensus model depicts two repeats base-paired to each other. The red highlighted regions indicate the conserved repeats, while the non-highlighted regions indicate the spacers. Non-canonical: non-Watson-Crick base pairs with intermediate icSHAPE reactivity (constrained by the surrounding base pairs). Conservation and icSHAPE: average for all 8 repeats. Mouse Xist in vitro SHAPE is similar to the HEK293 XIST icSHAPE. SPEN is crosslinked 3-5nt upstream of the inter-repeat duplex (see Figure 7 for SPEN iCLIP). See also Figure S7 and Table S6.
Figure 7. The A-repeat structure promotes SPEN binding and higher-order RNP formation
(A) In vitro iCLIP with human SPEN RRM2-4 and IRES-GFP or mouse repA RNA. The diagram shows the domain organization of SPEN. The autoradiograph shows one iCLIP experiment. The entire A-repeat region is 1630nt. The IRES-GFP RNA is 1533nt. The dimer band relative intensity is 1 for the repA RNA and 0.61 for the GFP RNA control. See Figure S7 for another replicate of the iCLIP experiment. (B) All six iCLIP tracks are normalized by total read count and scaled to 0-2300. (C) For each of the four SPEN+repA iCLIP tracks, the crosslinking frequencies for the top 5% of crosslinked nucleotides were extracted from the repeats region and the outside region. This analysis shows that SPEN binds the repeats region more than the outside region. (D) Nucleotides with the top 5% and bottom 5% of iCLIP signal were extracted from each of the four tracks, and then the icSHAPE signals were compared. This analysis shows that SPEN RRM2-4 are preferentially crosslinked to single-stranded regions (high icSHAPE signal). (E) Model of SPEN-repA association. The base pairing among the repeats is stochastic and only one specific conformation is shown here. SPEN binding requires both single-stranded and double-stranded regions, but SPEN is crosslinked only to the single-stranded nucleotides 3-5nt upstream of the inter-repeat duplex. See also Figure S7.
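The comparison in panel D (icSHAPE signal at the nucleotides with the strongest versus weakest iCLIP signal) can be sketched as follows. This is a minimal illustration with made-up numbers; the function names, the handling of ties, and the rounding of the 5% cutoff are assumptions, not the authors' procedure.

```python
def top_and_bottom(signal, fraction=0.05):
    """Indices of the highest and lowest `fraction` of positions by signal."""
    n = max(1, int(len(signal) * fraction))
    order = sorted(range(len(signal)), key=lambda i: signal[i])
    return order[-n:], order[:n]


def mean_at(values, indices):
    """Mean of `values` restricted to the given positions."""
    return sum(values[i] for i in indices) / len(indices)


# Hypothetical per-nucleotide tracks of equal length (20 nt).
iclip = [0, 1, 0, 2, 1, 0, 3, 1, 0, 40, 2, 1, 0, 1, 2, 0, 1, 3, 1, 2]
icshape = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.0, 0.9,
           0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.2, 0.4, 0.2, 0.3]

top, bottom = top_and_bottom(iclip)
print(mean_at(icshape, top), mean_at(icshape, bottom))
```

A higher mean icSHAPE reactivity at the top iCLIP positions than at the bottom ones would be consistent with preferential crosslinking to single-stranded nucleotides.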
