Review
Nat Rev Genet. 2023 Mar;24(3):178-196.
doi: 10.1038/s41576-022-00546-w. Epub 2022 Nov 8.

Probing the dynamic RNA structurome and its functions

Robert C Spitale et al. Nat Rev Genet. 2023 Mar.

Abstract

RNA is a key regulator of almost every cellular process, and the structures adopted by RNA molecules are thought to be central to their functions. The recent fast-paced evolution of high-throughput sequencing-based RNA structure mapping methods has enabled the rapid in vivo structural interrogation of entire cellular transcriptomes. Collectively, these studies are shedding new light on the long underestimated complexity of the structural organization of the transcriptome - the RNA structurome. Moreover, recent analyses are challenging the view that the RNA structurome is a static entity by revealing how RNA molecules establish intricate networks of alternative intramolecular and intermolecular interactions and that these ensembles of RNA structures are dynamically regulated to finely tune RNA functions in living cells. This new understanding of how RNA can shape cell phenotypes has important implications for the development of RNA-targeted therapeutic strategies.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Chemical probes for interrogating RNA structures.
a, Targets of different chemical probes on RNA, including dimethyl sulfate (DMS), α-ketoaldehydes (such as glyoxal and N3-kethoxal), 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC), selective 2′-hydroxyl acylation analysed by primer extension (SHAPE) reagents, hydroxyl radicals and nicotinoyl azide (NAz). Sites of chemical modification by probes measuring the pairing status of nucleobases (circles), the solvent accessibility of RNA residues (stars) and the flexibility of the RNA backbone (pentagons) are marked. b, Psoralen interacts with uridines on opposite strands of an RNA duplex and mediates cross-linking of the two strands upon long-wave UV irradiation (365 nm). Cross-linking can occur both intramolecularly and intermolecularly. c, The reaction of bifunctional acylating compounds, such as trans-bis-isatoic anhydride (TBIA) and spatial 2′-hydroxyl acylation reversible cross-linking (SHARC) reagents, results in cross-links between structurally flexible nucleotides that are spatially proximal to each other. Cross-linking can occur both intramolecularly and intermolecularly. d, Upon long-wave UV irradiation, NHS-diazirine cross-links RNA nucleotides and amino acids (usually lysine) of interacting proteins at the RNA–protein interaction interface.
Fig. 2. Read-out of high-throughput sequencing (HTS)-based RNA structure mapping experiments.
a, In chemical probing experiments, RNA undergoes reverse transcription following treatment with the chemical probe. When drop-off-based read-outs are used, the reverse transcriptase (RT) drops off the template at sites that have reacted with the probe, resulting in a pool of truncated cDNA molecules that terminate at the nucleotide prior to the modified site. Alternatively, in mutational profiling (MaP) experiments, reverse transcription conditions are adjusted so that the RT reads through the chemically modified sites but incorporates incorrect bases, resulting in (possibly full-length) cDNAs containing mutations at modification sites. In both cases, cDNA fragments are ligated to adapters, converted to double-stranded DNA libraries and sequenced. Sequencing reads (corresponding to cDNA fragments) are mapped back to the reference transcriptome. For RT drop-off-based experiments, each position i along the RNA is assigned a count corresponding to the number of reads whose 5′ ends mapped one nucleotide downstream (i + 1). For MaP-based experiments, the mutation frequency at each position of the RNA is calculated as the ratio between the number of mutated reads and the total number of reads covering that position. These raw reactivity profiles are then normalized to yield reactivities ranging between 0 (unreactive) and, depending on the normalization method, ≥1 (highly reactive). b, In direct RNA–RNA interaction capture experiments, RNA duplexes are cross-linked (for example by psoralen), RNA is fragmented and the two strands of the cross-linked duplexes are intramolecularly ligated, after which cross-linking is reversed. These chimeric RNA fragments are then reverse-transcribed and the resulting cDNA fragments are ligated to adapters, converted to double-stranded DNA libraries and sequenced. Sequencing reads are then mapped back to the reference transcriptome. 
As these reads are derived from RNA chimeras, the two halves of these reads will map to distinct locations of the same transcript in the case of intramolecular duplexes, or distinct transcripts in the case of intermolecular duplexes. Figure 2 is adapted from ref., CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).
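The counting rules in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline of any particular tool: coordinates are 0-based, the function names (`dropoff_counts`, `mutation_frequencies`, `normalize`, `classify_chimera`) are hypothetical, background subtraction against an untreated control is omitted, and the percentile scaling is only one of several possible normalization schemes.

```python
from collections import Counter

def dropoff_counts(read_starts, rna_length):
    """RT drop-off read-out: position i receives the number of reads
    whose 5' end maps one nucleotide downstream, at i + 1 (0-based)."""
    starts = Counter(read_starts)
    return [starts.get(i + 1, 0) for i in range(rna_length)]

def mutation_frequencies(mutated, coverage):
    """MaP read-out: mutated reads divided by reads covering each position.
    Uncovered positions are reported as 0.0 (an assumption of this sketch)."""
    return [m / c if c > 0 else 0.0 for m, c in zip(mutated, coverage)]

def normalize(raw, percentile=0.9):
    """Assumed normalization: divide by the value at the given percentile,
    so most positions fall in [0, 1] and the most reactive ones exceed 1."""
    ranked = sorted(raw)
    scale = ranked[int(percentile * (len(ranked) - 1))]
    if scale <= 0:
        return [0.0 for _ in raw]
    return [value / scale for value in raw]

def classify_chimera(left, right):
    """Each half of a chimeric read is a (transcript_id, position) pair.
    Halves on the same transcript indicate an intramolecular duplex."""
    return "intramolecular" if left[0] == right[0] else "intermolecular"
```

For example, `dropoff_counts([1, 1, 3, 0], 4)` assigns two stops to position 0 and one to position 2; the read starting at position 0 has no upstream nucleotide and is ignored.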
Fig. 3. Long-range intramolecular and intermolecular RNA–RNA interactions.
a, The SARS-CoV-2 genome establishes a wide range of mutually exclusive long-range interactions, many of which involve the untranslated regions (UTRs). Four possible structural configurations, observed to coexist in the context of infected host cells, are depicted (from top-left proceeding clockwise): the linear genome; the partially circularized genome owing to an interaction between ORF1a and the 3′ UTR; the partially circularized genome owing to an interaction between ORF1a and the 5′ UTR; and the fully circularized genome owing to an interaction between the 5′ and 3′ UTRs. It is not known what functions these conformations serve, nor which of them can interconvert (represented by question marks over arrows). Regions coloured in red can form alternative, mutually exclusive, short-range and long-range RNA–RNA interactions. b, In human cells, the orphan C/D-box small nucleolar RNA (snoRNA) SNORD83B forms intermolecular interactions with the SRSF3, RPS5 and NOP14 mRNAs. The functional relevance of these interactions, which have been shown to modulate the steady-state levels of these mRNAs, is still unknown. c, The ZIKV genome can circularize owing to a long-range interaction between the 5′ and 3′ cyclization sequences (CSs) located at the termini of the genome. Genome cyclization promotes viral replication, whilst hampering translation. In its linear form, the 5′ CS region of the genome has been reported to establish an intermolecular interaction with the host hsa-miR-21 microRNA (in complex with AGO2). Although the mechanistic details of this interaction are still unknown, depletion of hsa-miR-21 potently reduces the cellular levels of viral RNA. BSL, bulged stem-loop; cHP, capsid hairpin; DAR, downstream of AUG region; DCS-PK, downstream of 5′ CS pseudoknot; HVR, hypervariable region; s2m, stem-loop II-like motif; SL, stem-loop; UAR, upstream of AUG region.
Fig. 4. Determinants of RNA structure heterogeneity in the cell.
Under cellular conditions, the folding landscape of an RNA molecule is constantly changing and RNA molecules can undergo numerous structural rearrangements (Box 1). RNA molecules fold as they get transcribed, and the structures they adopt will change as transcription proceeds. Co-transcriptional events, such as the deposition of RNA post-transcriptional modifications (PTMs) or alternative splicing, can affect varying proportions of the RNA molecules and result in structurally diverse subpopulations. Differential binding of RNA binding proteins (RBPs) can further lead to substantial structural heterogeneity within and across cellular compartments. In the cytoplasm, translation (which itself can be regulated by RNA structure) can also shape the structure of RNA molecules because of the intrinsic helicase activity of the ribosome. Alternative RNA structures are coloured red. These alternative conformations may coexist in the cell, resulting in a heterogeneous ensemble.
Fig. 5. Experimental and computational methods for RNA ensemble deconvolution.
a, Assays such as co-transcriptional selective 2′-hydroxyl acylation analysed by primer extension (SHAPE) followed by sequencing (SHAPE-seq) and structural probing of elongating transcripts followed by sequencing (SPET-seq) allow RNA co-transcriptional structure folding pathways to be deconvolved by first probing the entire population of transcription intermediates, followed by the computational reconstruction of the individual reactivity profiles. Plotting these reactivity profiles in the form of a heatmap, with the rows corresponding to distinct transcription intermediates sorted by increasing length, provides intuitive visualization of RNA structural rearrangements occurring as transcription proceeds (top to bottom). The example shows two transcription intermediates, each represented by the rows denoted in yellow. During the transition from the first to the second intermediate, the reactivity of the unpaired regions (coloured purple and green on the structures) progressively drops (purple and green boxes on the heatmap) as they begin to undergo base-pairing, resulting in a pseudoknot (purple region) and a stem-loop (SL) (green region). b, Mutate and map (M2) provides an indirect way to deconvolve RNA structure ensembles by randomly generating a large number of single-nucleotide substitution mutants of an RNA of interest, followed by structure probing analysis. Mutations capable of disrupting base-pairing interactions in the wild-type structure, whilst stabilizing alternative folds, will cause a redistribution of the relative abundance of the structures within the ensemble, leading to reactivity changes. The reactivity profiles of these mutants can then be used to infer the structure of these alternative conformations. 
c, The first group of computational methods for ensemble deconvolution exploits thermodynamics-guided RNA structure prediction software to sample a large number of structures from the theoretical ensemble the RNA of interest can form, and then uses the experimental data to select the smallest possible subset of structures that can explain the data. Typically, structures are then clustered together by similarity and a single representative structure is returned for each cluster. This class of approaches is suitable for the analysis of both reverse transcriptase (RT)-stop and mutational profiling (MaP) RNA structure probing data. d, The second group of computational methods for ensemble deconvolution involves direct read clustering. These methods take sequencing reads from MaP experiments and attempt to define clusters of reads with correlated patterns of mutations, corresponding to alternative RNA conformations. Clustered reads can be processed into reactivity profiles that can then be used to inform structure modelling.
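The selection step described in panel c (find the smallest subset of candidate structures that explains the probing data) can be illustrated with a toy brute-force search. This is a sketch under strong simplifying assumptions and does not reproduce any published tool: candidate structures are supplied by hand in dot-bracket notation rather than sampled by thermodynamic prediction software, each structure is reduced to an idealized profile (1 for unpaired, 0 for paired), mixture weights are restricted to a coarse grid, and no clustering of similar structures is performed. All function names are hypothetical.

```python
from itertools import combinations

def unpaired_profile(dot_bracket):
    """Idealized reactivity: 1.0 for unpaired ('.'), 0.0 for paired bases."""
    return [1.0 if symbol == "." else 0.0 for symbol in dot_bracket]

def weight_grid(parts, steps):
    """Yield weight tuples (positive multiples of 1/steps) that sum to 1."""
    def compositions(total, n):
        if n == 1:
            yield (total,)
            return
        for first in range(1, total - n + 2):
            for rest in compositions(total - first, n - 1):
                yield (first,) + rest
    for counts in compositions(steps, parts):
        yield tuple(count / steps for count in counts)

def squared_error(profiles, weights, observed):
    """Sum of squared differences between observed and mixture-predicted values."""
    error = 0.0
    for i, value in enumerate(observed):
        predicted = sum(w * p[i] for w, p in zip(weights, profiles))
        error += (value - predicted) ** 2
    return error

def smallest_explaining_subset(structures, observed, tolerance=1e-6, steps=10):
    """Return [(structure, weight), ...] for the smallest subset whose best
    weighted mixture fits `observed` within `tolerance`, or None if none does."""
    profiles = [unpaired_profile(s) for s in structures]
    for size in range(1, len(structures) + 1):
        best = None
        for indices in combinations(range(len(structures)), size):
            chosen = [profiles[i] for i in indices]
            for weights in weight_grid(size, steps):
                error = squared_error(chosen, weights, observed)
                if best is None or error < best[0]:
                    best = (error, indices, weights)
        if best is not None and best[0] <= tolerance:
            _, indices, weights = best
            return [(structures[i], w) for i, w in zip(indices, weights)]
    return None
```

Given the candidates `"((...))"`, `"......."` and `".(...)."` and an observed profile of `[0.3, 0.3, 1.0, 1.0, 1.0, 0.3, 0.3]`, no single structure fits, and the search settles on a 70:30 mixture of the first two. Real data are noisy, so a practical tolerance would need to be far looser than the default used here.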
Fig. 6. RNA structure ensembles identified in high-throughput sequencing (HTS)-based structure probing studies.
a, The structure ensemble of the HIV-1 Rev response element (RRE) populates two conformations, a four-way junction (the minor conformation) and a five-way junction (the major conformation); regions that adopt alternative structures in these two conformations are coloured red. The major conformation can interact with the viral protein, Rev, which promotes nuclear export of the viral genome. This export is crucial both for the translation of the Gag and Gag-Pol proteins and for the packaging of new virions. b, Splicing of the transcript encoding the transactivator protein Tat of HIV-1 is controlled by a switch between two alternative conformations, with consequences for transcription of the HIV-1 genome. In the minor conformation, the A3 splice site is inaccessible to binding by the U2AF splicing factor and, as a result, no functional Tat protein is produced. In the absence of Tat, transcription of the HIV-1 double-stranded DNA genome by the host RNA Polymerase II is highly inefficient. By contrast, the A3 splice site of the Tat transcript is accessible in the major conformation leading to productive splicing, and the resulting Tat protein promotes efficient transcription of the HIV-1 genome. c, In human cells, the activity of P-TEFb, a positive regulator of transcription, is controlled by the 7SK snRNA, which is capable of binding and sequestering P-TEFb. The structure ensemble of 7SK populates two major conformations: one that contains the SL1 stem-loop, which can bind to and sequester P-TEFb (P-TEFb-bound); and one that contains the SL1alt stem-loop and cannot sequester P-TEFb (P-TEFb-unbound). Thus, switching between structures containing the SL1 and SL1alt stem-loops regulates the binding of P-TEFb and, thereby, its availability for promoting transcription. A third highly dynamic minor conformation of 7SK has also been identified and hypothesized to represent an intermediate state between the two major conformations. This highly dynamic intermediate is possibly an average of multiple low-abundance conformations. Arrows with question marks above indicate that it is not yet known whether those conformations can interconvert. Part c is adapted with permission from ref., Elsevier.
Fig. 7. Challenges in high-throughput sequencing (HTS)-based RNA structure mapping studies.
a, Mapping of pseudoknots can potentially be achieved by combining direct RNA–RNA interaction capture with methods for ensemble deconvolution from chemical probing experiments. Although RNA duplex mapping does not preserve any information about the relationship between two independent helices, using ensemble deconvolution analysis to determine whether the region of the RNA encompassing these helices populates one or two conformations can help determine whether two incompatible helices coexist within the same RNA molecule, forming a pseudoknot, or whether they belong to two independent RNA molecules. b, Specialized structure probing assays can aid the analysis of RNA structure ensembles in vivo. Coupling of chemical probing with single-cell analysis (top), RNA immunoprecipitation (middle) or polysome fractionation (bottom) would increase the resolution of RNA structure analyses, possibly enabling the characterization of low-abundance RNA conformations. c, RNA chemical probing can aid the mapping of small molecule–RNA interactions. Analysis of population-averaged reactivities can be used to identify footprints of small molecules binding to RNA. The coupling of chemical probing with ensemble deconvolution analysis can further help elucidate binding modes of small molecules, possibly enabling the identification of specific RNA conformations targeted by the small molecule.
