Review
Acc Chem Res. 2021 May 18;54(10):2502-2517.
doi: 10.1021/acs.accounts.1c00118. Epub 2021 May 7.

SHAPE Directed Discovery of New Functions in Large RNAs

Kevin M Weeks. Acc Chem Res. 2021.

Abstract

RNA lies upstream of nearly all biology and functions as the central conduit of information exchange in all cells. RNA molecules encode information both in their primary sequences and in complex structures that form when an RNA folds back on itself. From the time of discovery of mRNA in the late 1950s until quite recently, we had only a rudimentary understanding of RNA structure across vast regions of most messenger and noncoding RNAs. This deficit is now rapidly being addressed, especially by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry, mutational profiling (MaP), and closely related platform technologies that, collectively, create chemical microscopes for RNA. These technologies make it possible to interrogate RNA structure, quantitatively, at nucleotide resolution, and at large scales, for entire mRNAs, noncoding RNAs, and viral RNA genomes. By applying comprehensive structure probing to diverse problems, we and others are showing that control of biological function mediated by RNA structure is ubiquitous across prokaryotic and eukaryotic organisms.

Work over the past decade using SHAPE-based analyses has clarified key principles. First, the method of RNA structure probing matters. SHAPE-MaP, with its direct and one-step readout that probes nearly every nucleotide by reaction at the 2'-hydroxyl, gives a more detailed and accurate readout than alternatives. Second, comprehensive chemical probing is essential. Focusing on fragments of large RNAs or using meta-gene or statistical analyses to compensate for sparse data sets misses critical features and often yields structure models with poor predictive power. Finally, every RNA has its own internal structural personality. There are myriad ways in which RNA structure modulates sequence accessibility, protein binding, translation, splice-site choice, phase separation, and other fundamental biological processes. In essentially every instance where we have applied rigorous and quantitative SHAPE technologies to study RNA structure-function interrelationships, new insights regarding biological regulatory mechanisms have emerged. RNA elements with more complex higher-order structures appear more likely to contain high-information-content clefts and pockets that bind small molecules, broadly informing a vigorous field of RNA-targeted drug discovery.

The broad implications of this collective work are twofold. First, it is long past time to abandon depiction of large RNAs as simple noodle-like or gently flowing molecules. Instead, we need to emphasize that nearly all RNAs are punctuated with distinctive internal structures, a subset of which modulate function in profound ways. Second, structure probing should be an integral component of any effort that seeks to understand the functional nexuses and biological roles of large RNAs.


Figures

Figure 1.
Architecture and potential internal structures of RNA molecules. Important classes of motifs are shown, a subset of which might occur in any given RNA.
Figure 2.
Mechanism of RNA SHAPE chemistry. (A) Reaction scheme. (B) Nucleotide conformations that enhance SHAPE reactivity involve (top) general base catalysis by the pyrimidine O2 (or purine N3), pro-S oxygen, or through-space groups and (bottom) stabilization of the 2’-oxyanion via conformations that direct nonbridging oxygen groups away from the 2’-OH group. (C) Correlation between the model-free generalized order parameter, S², and SHAPE reactivity, exemplified by the U1A-binding RNA element. (D) SHAPE reactivities superimposed on the TPP riboswitch aptamer domain showing (left) absolute SHAPE reactivities and (right) changes in SHAPE reactivities (and thus local nucleotide dynamics) upon ligand binding (gray). (E) Classes of SHAPE reagents.
Figure 3.
Overview of SHAPE, MaP, and RNA structure modeling. (A) Mutational profiling. RNA is treated with a reagent that reports secondary or tertiary structure; relaxed-fidelity reverse transcription records chemical adducts as mutations relative to the original sequence (in red) internally in the cDNA; cDNAs are sequenced; and reads are aligned and used to create reactivity profiles. Data may be interpreted on a per-nucleotide basis or as through-space internucleotide correlations. (B) The ∆GSHAPE pseudo-free energy change relationship that enables SHAPE-directed structure modeling. (C) Per-nucleotide reactivity profile for a domain of the STMV RNA genome. (D, E) Secondary structure models (based on data in panel C) shown as (D) probability arc plots and (E) minimum free energy secondary structure diagram. In probability arcs and lines connecting base pairs, colors indicate the likelihood of unique pairing for a given nucleotide. (F) Representative secondary structure modeling, showing the SAM-I riboswitch (left) without and (right) with SHAPE data. SHAPE-directed modeling often yields dramatic improvements, including for RNAs containing pseudoknots. Arcs indicate correct (green), incorrect (purple), and missing (red) pairs, relative to accepted structure.
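The ∆GSHAPE relationship named in panel B is not written out in this caption. The sketch below assumes the widely used log-linear form, ∆GSHAPE = m·ln(reactivity + 1) + b, applied per nucleotide. The slope m = 1.8 kcal/mol and intercept b = -0.6 kcal/mol are commonly used defaults and are assumptions here, as are the function name and the handling of missing or negative values.

```python
import math

def shape_pseudo_energy(reactivity, m=1.8, b=-0.6):
    """Pseudo-free energy term (kcal/mol) for one nucleotide.

    Assumed form: dG = m * ln(reactivity + 1) + b.
    m and b are illustrative defaults, not values taken from the caption.
    """
    if reactivity is None:
        # No probing data for this nucleotide: contribute nothing.
        return 0.0
    # Simplification (assumption): treat negative reactivities as zero.
    r = max(reactivity, 0.0)
    return m * math.log(r + 1.0) + b

# Low reactivity gives a favorable (negative) pairing bonus,
# high reactivity gives a penalty.
profile = [0.05, 0.10, 1.50, 2.20, None]
energies = [round(shape_pseudo_energy(r), 3) for r in profile]
print(energies)
```

With these assumed parameters the term changes sign near a reactivity of about 0.4, so nucleotides below that value are nudged toward pairing and those above it are nudged away from pairing.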
Figure 4.
Identification of well-determined RNA structures. (box) Equation for Shannon entropy (H) and illustration of overlap of low SHAPE and low Shannon entropy (lowSS) regions. (A) Representative analysis illustrating that functional RNA elements tend to overlap with lowSS regions (blue shading), here corresponding to the 5’-UTR and frameshift elements of HIV-1. (B and C) lowSS analysis of (B) native-like and refolded dengue RNA genomes and (C) human rRNAs. (D) Illustration of the enormous diversity in lowSS regions, extending from the TPP riboswitch (79 nts, 100% lowSS) to the HIV-1 RNA genome (9,173 nts, 40% lowSS), and including bacterial and human rRNAs. All RNAs shown on the same length scale.
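The boxed equation is not reproduced in this caption. As a rough sketch, the per-nucleotide Shannon entropy can be computed from base-pairing probabilities as H = -Σ p·log p, and lowSS positions can be flagged where windowed medians of both SHAPE reactivity and entropy fall below cutoffs. The log base (10), the window size, both cutoffs, and the function names below are illustrative assumptions, not values taken from the caption.

```python
import math
from statistics import median

def shannon_entropy(pair_probs):
    """Entropy of one nucleotide from its pairing probabilities.

    pair_probs: probabilities of pairing with each possible partner.
    Zero-probability partners contribute nothing. Base-10 log is an assumption.
    """
    return -sum(p * math.log10(p) for p in pair_probs if p > 0)

def low_ss_positions(shape, entropy, window=51, max_shape=0.3, max_entropy=0.04):
    """Return 0-based positions whose centered window has both a low median
    SHAPE reactivity and a low median Shannon entropy.

    Window size and both cutoffs are hypothetical, for illustration only.
    """
    n = len(shape)
    half = window // 2
    flagged = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)  # truncate at the ends
        if median(shape[lo:hi]) < max_shape and median(entropy[lo:hi]) < max_entropy:
            flagged.append(i)
    return flagged

# Toy profile: a well-determined first half and a dynamic second half.
toy_shape = [0.1] * 10 + [1.0] * 10
toy_entropy = [0.01] * 10 + [0.5] * 10
print(low_ss_positions(toy_shape, toy_entropy, window=5))
```

Requiring both criteria at once is the point of the metric: low reactivity alone indicates constrained nucleotides, while low entropy indicates that a single structure model dominates.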
Figure 5.
Comprehensiveness in understanding RNA structure and RNA-protein interactions. (A) Secondary structure models for CAG repeat-containing sequences from human HTT mRNA based on analysis of (left) local structures and (right) the full-length first exon (360–500 nts, depending on number of CAG repeats). Start codon is boxed in green. (B) Secondary structure models of the HIV-1 RNA frameshift element based on analysis of (left) local structure and (right) the entire HIV-1 genome. (C) Time-resolved folding of a retroviral RNA packaging domain in the absence and presence of nucleocapsid chaperone. Per-nucleotide reactivities are shown on a scale from red (high) to black (low). Nucleotides are grouped by k-means clustering; kinetic profiles are shown for each cluster. (D) SHAPE-detected protein-RNA contacts 7 seconds after protein addition reveal interactions at guanosine (blue). (E) Model for chaperone-mediated facilitation of RNA folding by destabilizing interactions involving guanosine.
Figure 6.
Cellular environment and RNA structure. (A) SHAPE analysis of the adenine riboswitch aptamer domain under simplified conditions, in cells (the reference state, box), and in the presence of ligand. Higher and lower SHAPE reactivities relative to in-cell RNA are red and blue, respectively. (B) Model for the structure of the 16S rRNA in free 30S subunits in cells. SHAPE reactivity pattern in helices 28 and 44 is incompatible with the structure visualized in high-resolution structures. (C) Movement of helix 44 in the in-cell state, emphasizing a large-scale conformational switch. (D) The ∆SHAPE framework for identifying significant changes between two states. Structure shows ∆SHAPE sites in the human U1 snRNP complex (green spheres) and their proximal proteins. (E) Protein interactions across the mouse Xist lncRNA mapped using large-scale difference analysis. (top) Effects of the in-cell environment categorized by absolute differences in SHAPE reactivity (50-nt sliding window). (middle) Ratio of positive to negative differences is suggestive of protein binding and RNA structural rearrangement. (bottom) Positive and negative ∆SHAPE sites are blue and red, respectively. (F) Effect of translation on RNA structure in E. coli cells. SHAPE reactivities increase, relative to the cell-free state, specifically in highly translated genes; kasugamycin treatment partially abrogates this increase.
Figure 7.
“Structural personalities” of bacterial and human RNAs. (A) Meta-gene representation of averaged mRNA structure in E. coli based on SHAPE reactivity and A/U nucleotide content. (B) Per-nucleotide SHAPE reactivities for one lncRNA and six mRNAs from E. coli. (C) Model for how RNA structure modulates accessibility to regulatory sequences. Unfolding an RNA motif (∆G) imposes an energetic penalty on translation initiation (and likely many other processes). (D) RNA structures that tune translation initiation in E. coli. Brown box indicates AUG translation start site; arcs illustrate RNA base pairs. (E) Example of SERPINA1 mRNA structure containing a primary ORF and three (α, β, γ) competing upstream open reading frames (uORFs). Minimum free energy structures are shown, nucleotides are colored by SHAPE reactivity, and Kozak sequences are boxed.
Figure 8.
De novo discovery of functional RNA elements as lowSS regions. (A) A lowSS region (gray box) in the rpmB mRNA forms an autoregulatory element that mimics L9/L28 binding sites in the 23S rRNA. (B) Novel RNA regulatory elements identified in E. coli. Structures are annotated by SHAPE reactivity and evidence for conservation. (C) Conservation of lowSS structures identified in enterobacteria and evidence of function based on literature. (D) Mechanisms by which RNA structure regulates gene expression across the E. coli transcriptome based on identification of well-determined secondary structures. Arcs indicate base pairs.
Figure 9.
Discovery of higher-order RNA structure. (A) Well-determined secondary structure elements across the first half of the DENV2 genome based on the lowSS metric. (B) RING correlated chemical probing analysis of three RNA elements with well-determined structures and higher-order folds. Elements are named by the gene in which they occur. (C) Effect of mutations in regions with significant RING correlations on (top) global compaction of the DENV2 RNA genome (measured by dynamic light scattering) and on (bottom) replication fitness (visualized by immunostaining of DENV2 envelope protein and nuclei). (D) Modeled three-dimensional folds of three elements in the DENV2 RNA based on RING correlations used to restrain discrete molecular dynamics simulations.
