Structure-seq2: sensitive and accurate genome-wide profiling of RNA structure in vivo

Laura E Ritchey¹,², Zhao Su³, Yin Tang⁴, David C Tack³, Sarah M Assmann³, Philip C Bevilacqua¹,²,⁵

Affiliations

¹ Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA.
² Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA.
³ Department of Biology, Pennsylvania State University, University Park, PA 16802, USA.
⁴ Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA.
⁵ Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA.

PMID: 28637286
PMCID: PMC5737731
DOI: 10.1093/nar/gkx533

Laura E Ritchey et al. Nucleic Acids Res. 2017 Aug 21;45(14):e135. doi: 10.1093/nar/gkx533.

Abstract

RNA serves many functions in biology such as splicing, temperature sensing, and innate immunity. These functions are often determined by the structure of RNA. There is thus a pressing need to understand RNA structure and how it changes during diverse biological processes both in vivo and genome-wide. Here, we present Structure-seq2, which provides nucleotide-resolution RNA structural information in vivo and genome-wide. This optimized version of our original Structure-seq method increases sensitivity by at least 4-fold and improves data quality by minimizing formation of a deleterious by-product, reducing ligation bias, and improving read coverage. We also present a variation of Structure-seq2 in which a biotinylated nucleotide is incorporated during reverse transcription, which greatly facilitates the protocol by eliminating two PAGE purification steps. We benchmark Structure-seq2 on both mRNA and rRNA structure in rice (Oryza sativa). We demonstrate that Structure-seq2 can lead to new biological insights. Our Structure-seq2 datasets uncover hidden breaks in chloroplast rRNA and identify a previously unreported N1-methyladenosine (m1A) in a nuclear-encoded Oryza sativa rRNA. Overall, Structure-seq2 is a rapid, sensitive, and unbiased method to probe RNA in vivo and genome-wide that facilitates new insights into RNA biology.

Figures

**Figure 1.**
Two versions of Structure-seq2 produce high quality data. In Structure-seq2, RNA (kelly green) is first modified by DMS or another chemical that can be read-out through reverse transcription. The RNA is then prepared for Illumina NGS sequencing by conversion to cDNA (Step 1A/1B, blue), ligating an adaptor (Step 3A/3B), and amplifying the products while incorporating TruSeq primer sequences (Step 5A/5B). In order to increase library quality, numerous improvements were made to the original Structure-seq protocol (boxed). These include performing the ligation with a hairpin adaptor and T4 DNA ligase (Step 3A/3B; pink) (10), and adding various purification steps to remove a deleterious by-product (Figure 2A). We present two options for purification: PAGE purification (A) or a biotin–streptavidin pull down (B). In the PAGE purification method, an additional PAGE purification step is added after reverse transcription (Step 2A). In the biotin–streptavidin pull down method, biotinylated dNTPs (cyan) are incorporated into the extended product during reverse transcription (Step 1B) and are purified via a magnetic streptavidin pull down after reverse transcription (Step 2B) and after ligation (Step 4B). There is also a common, final PAGE purification step following amplification (Step 5A/5B). Finally, a custom sequencing primer (light green) is used during sequencing (Step 7A/7B) to further provide high quality data. Supplementary Figure S1 is a version of this figure with all the nucleotides shown explicitly.

**Figure 2.**
Structure-seq2 leads to a lower ligation bias. (A) After RT (Figure 1, step 1A/1B), excess of the 27 nt primer (blue, top, right) is still present in the solution. During ligation (Figure 1, step 3A/3B), this primer can also ligate to the 40 nt hairpin adaptor (pink) to form an unwanted 67 nt by-product which has no insert and so results in sequencing reads with no utility. (B) The complement of the first nucleotide after the adaptor sequence read during sequencing is the nucleotide that ligated to the adaptor. Our new T4 DNA ligase-based method (green, –DMS and pink, +DMS) substantially decreases ligation bias as compared to the previous Circligase-based method (blue). Percentages equaling the transcriptomic distribution of the four nucleotides (black) are ideal.
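
The ligation-bias readout described in panel B can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy reads, and the uniform background are all hypothetical, and the reads are assumed to be already trimmed so that their first base is the first nucleotide after the adaptor sequence.

```python
from collections import Counter

# DNA complements; sequencing reads of the cDNA library are DNA, so T rather than U.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def ligated_nucleotide_fractions(reads):
    """Return the fraction of each nucleotide that ligated to the adaptor.

    Assumption of this sketch: each read starts at the first nucleotide
    after the adaptor sequence. The complement of that base is taken as
    the nucleotide that ligated to the adaptor, as stated in the caption.
    """
    counts = Counter()
    for read in reads:
        if not read:
            continue
        base = read[0].upper()
        if base in COMPLEMENT:  # skip ambiguous calls such as N
            counts[COMPLEMENT[base]] += 1
    total = sum(counts.values())
    return {nt: (counts[nt] / total if total else 0.0) for nt in "ACGT"}


def deviation_from_background(fractions, background):
    """Observed minus expected fraction per nucleotide; 0 everywhere is ideal."""
    return {nt: fractions[nt] - background[nt] for nt in "ACGT"}


# Toy reads and a hypothetical uniform background, for illustration only.
reads = ["TGCA", "TTAG", "GACC", "CATG"]
fractions = ligated_nucleotide_fractions(reads)
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
deviation = deviation_from_background(fractions, background)
```

In practice the background would be the transcriptomic nucleotide distribution (the black bars in panel B) rather than a uniform 25% per nucleotide.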

**Figure 3.**
Structure-seq2 identifies a previously unreported m¹A in 25S rRNA. (A) Using the original Structure-seq conditions for RT denaturation (65°C with no monovalent salt), some regions receive no reads (denoted with arrows). (B) More stringent denaturation conditions (90°C with monovalent salt) allow these regions to be read (denoted with color-matched arrows) and narrow the regions of low read depth. The total number of reads is similar in panels A and B. The location of the large drop in reads downstream of the single region in 25S that remains devoid of reads (red arrow) corresponds to a site known to contain an m¹A in yeast, human, and *H. marismortui* (C, Supplementary Figure S13) (16,18). Reads continue to decrease until they go to zero at nucleotide 539. The region between nucleotides 432 and 644 is 79% GC, with a read depth <100 at each nucleotide. (D) This site corresponds to a high RT stop count at the precise location in the –DMS data.
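
The two quantities quoted for the low-coverage region, GC content and per-nucleotide read depth over a coordinate window, are straightforward to compute. The sketch below is illustrative only: the helper names, the toy sequence, and the toy depth values are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def gc_fraction(seq, start, end):
    """GC fraction of seq over the 1-based, inclusive window [start, end]."""
    window = seq[start - 1:end].upper()
    if not window:
        return 0.0
    return sum(base in "GC" for base in window) / len(window)


def all_below_depth(depths, start, end, threshold=100):
    """True if every position in the 1-based, inclusive window has depth < threshold."""
    return all(depth < threshold for depth in depths[start - 1:end])


# Toy sequence and per-nucleotide read depths, for illustration only.
toy_seq = "AUGGCCGCGA"
toy_depth = [500, 400, 90, 80, 20, 0, 0, 10, 300, 450]

window_gc = gc_fraction(toy_seq, 3, 8)
window_low = all_below_depth(toy_depth, 3, 8)
```

With real data, the window would be nucleotides 432 to 644 of the 25S rRNA and the threshold would be the depth of 100 mentioned in the caption.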

**Figure 4.**
Structure-seq2 demonstrates the presence of two hidden breaks in chloroplast rRNA. At the two locations known to harbor hidden breaks in chloroplast rRNA, the –DMS RT stop count data spike. The spike at the first hidden break (A) differs by one nucleotide from the published break site in spinach and Arabidopsis (21,28), which could be due to the slight sequence variation between species (Arabidopsis: 5′-GGGAGUGAAA*UAGAACA-3′, Rice: 5′-GGGUAGUGAAAU*AGAACG-3′, where * indicates the proposed break site). The spike at the second hidden break (B) occurs precisely at the published cleavage site for spinach and Arabidopsis (21,28).
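
One simple way to flag spikes of this kind in –DMS RT stop counts is to compare each position against a baseline for the transcript. The sketch below is an assumption-laden illustration rather than the analysis used in the paper: the function name, the median baseline, and the `fold` and `min_count` thresholds are all hypothetical choices.

```python
from statistics import median


def find_rt_stop_spikes(stop_counts, fold=10.0, min_count=50):
    """Return 1-based positions whose RT stop count stands far above baseline.

    Hypothetical criteria for this sketch: a position is reported when its
    count is at least `min_count` and exceeds `fold` times the median count
    across the transcript (the median is floored at 1 to avoid a zero baseline).
    """
    baseline = median(stop_counts) if stop_counts else 0
    spikes = []
    for index, count in enumerate(stop_counts):
        if count >= min_count and count > fold * max(baseline, 1):
            spikes.append(index + 1)  # convert 0-based index to 1-based position
    return spikes


# Toy -DMS RT stop counts with a single pronounced spike, for illustration only.
toy_counts = [3, 5, 4, 6, 900, 5, 4, 3, 7, 5]
spike_positions = find_rt_stop_spikes(toy_counts)
```

A spike found this way only marks a candidate site; distinguishing a hidden break from a modification-induced stop would still rest on comparison with published sites, as in the caption.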

References

1. Bevilacqua P.C., Ritchey L.E., Su Z., Assmann S.M. Genome-wide analysis of RNA secondary structure. Annu. Rev. Genet. 2016; 50:235–266.
2. Kwok C.K., Tang Y., Assmann S.M., Bevilacqua P.C. The RNA structurome: transcriptome-wide structure probing with next-generation sequencing. Trends Biochem. Sci. 2015; 40:221–232.
3. Strobel E.J., Watters K.E., Loughrey D., Lucks J.B. RNA systems biology: uniting functional discoveries and structural tools to understand global roles of RNAs. Curr. Opin. Biotechnol. 2016; 39:182–191.
4. Kubota M., Tran C., Spitale R.C. Progress and challenges for chemical probing of RNA structure inside living cells. Nat. Chem. Biol. 2015; 11:933–941.
5. Ding Y., Kwok C.K., Tang Y., Bevilacqua P.C., Assmann S.M. Genome-wide profiling of in vivo RNA structure at single-nucleotide resolution using structure-seq. Nat. Protoc. 2015; 10:1050–1066.
