Nat Biotechnol. 2021 Aug;39(8):978-988.
doi: 10.1038/s41587-021-00874-y. Epub 2021 Apr 15.

Quantitative mapping of the cellular small RNA landscape with AQRNA-seq

Jennifer F Hu et al. Nat Biotechnol. 2021 Aug.

Abstract

Current next-generation RNA-sequencing (RNA-seq) methods do not provide accurate quantification of small RNAs within a sample, due to sequence-dependent biases in capture, ligation and amplification during library preparation. We present a method, absolute quantification RNA-sequencing (AQRNA-seq), that minimizes biases and provides a direct, linear correlation between sequencing read count and copy number for all small RNAs in a sample. Library preparation and data processing were optimized and validated using a 963-member microRNA reference library, oligonucleotide standards of varying length, and RNA blots. Application of AQRNA-seq to a panel of human cancer cells revealed >800 detectable miRNAs that varied during cancer progression, while application to bacterial transfer RNA pools, with the challenges of secondary structure and abundant modifications, revealed 80-fold variation in tRNA isoacceptor levels, stress-induced site-specific tRNA fragmentation, quantitative modification maps, and evidence for stress-induced, tRNA-driven, codon-biased translation. AQRNA-seq thus provides a versatile means to quantitatively map the small RNA landscape in cells.

Figures

Figure 1. Overview of AQRNA-seq.
(a) Library preparation workflow. (b-d) Optimization experiments for the library preparation workflow. (b) Linker 1 ligation and removal proceed with >90% efficiency at a linker:tRNA molar ratio of 50:1. Dot plot shows all data with bars for mean and SD, N=3. (c) AlkB demethylation efficiencies for RNA modifications. The data represent percent reduction for a single experiment (N=1). (d) Linker 2 ligation is nearly 100% efficient at linker:tRNA molar ratios ≥30:1. A 50:1 ratio is used here, with linker 2 removal nearly 100% efficient. Dot plot shows all data with bars for mean and SD, N=3. (e) Data processing workflow for bacterial tRNAs. Reads are mapped against a non-redundant reference genome. The paired-end protocol of AQRNA-seq yields two FASTQ files per library, one each for forward and reverse reads. After alignment, multiply- and uniquely-mapped reads are separated and mined for abundance and coverage information. (f) Data processing workflow for human miRNAs. While bacterial reads are mapped first and then counted according to the mapped RNA species, human inserts are counted first and then searched with BLAST against RNA reference sequences or mapped to the entire genome for annotation. Random sequencing errors are corrected and read pairs cross-validated by assembling paired-end forward and reverse reads before counting.
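The separation of multiply- and uniquely-mapped reads described in panel e can be illustrated with a minimal sketch. It is not the authors' pipeline: the input is assumed to be a simplified list of (read ID, reference name) pairs standing in for parsed aligner output, and the function names `split_by_mapping` and `count_unique` are hypothetical.

```python
from collections import defaultdict


def split_by_mapping(alignments):
    """Partition reads by the number of distinct references they align to.

    `alignments` is an iterable of (read_id, reference_name) pairs. This is
    an assumed, simplified representation of aligner output, not a real
    SAM/BAM parser.
    """
    refs_per_read = defaultdict(set)
    for read_id, ref in alignments:
        refs_per_read[read_id].add(ref)

    unique, multiple = {}, {}
    for read_id, refs in refs_per_read.items():
        if len(refs) == 1:
            unique[read_id] = next(iter(refs))   # exactly one reference
        else:
            multiple[read_id] = sorted(refs)     # ambiguous assignment
    return unique, multiple


def count_unique(unique):
    """Count uniquely-mapped reads per reference."""
    counts = defaultdict(int)
    for ref in unique.values():
        counts[ref] += 1
    return dict(counts)


# Toy input; read IDs are made up for illustration.
alignments = [
    ("read1", "tRNA-Gly-CCC"),
    ("read2", "tRNA-Gly-CCC"),
    ("read2", "tRNA-Gly-GCC"),
    ("read3", "tRNA-Tyr-GTA"),
]
unique, multiple = split_by_mapping(alignments)
abundance = count_unique(unique)
```

Here `read2` aligns to two references and is set aside as multiply-mapped, while `read1` and `read3` contribute to per-tRNA abundance. How multiply-mapped reads are ultimately used is not specified in the caption, so the sketch only separates them.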
Figure 2. Quantitative validation of AQRNA-seq.
(a) Oligonucleotide spike-ins demonstrate a linear relationship between copy number and read count. Oligonucleotides 25–80 nt long were subjected to AQRNA-seq at different concentrations (GEO accession GSE139936). Dot plots show all data, bar denoting mean, for N=3 experiments. (b) Minimal sequence bias in AQRNA-seq analysis of the 963 miRNA Miltenyi miRXplore Universal Reference (GEO accession GSE139936). Among measured miRNAs, the 5’ and 3’ nucleotides were tabulated and their proportions plotted. Dot plot shows data for N=3 experiments on Day 1 and N=1 experiment on Day 2, with a dash denoting expected proportions of A, C, G, and U at each end among all 963 reference miRNAs. (c) AQRNA-seq quantitative fidelity was assessed using the miRXplore Reference (GEO accession GSE139936). Sequencing reads for each miRNA were normalized to expected values and sorted into 5 bins as denoted in the graph. The colored bar indicates the percentage of reads within 2- and 10-fold of expected abundance. (d) Comparison of the quantitative accuracy of miRNA libraries prepared from the miRXplore Reference using the AQRNA-seq (“AQ”) protocol and the following small RNA or miRNA library kits: Illumina TruSeq (ILM), Lexogen (LEX), NEBNext for Illumina (NEB), Perkin Elmer NextFlex (PEB), QIAseq miRNA (QIA), and Trilink CleanTag (TRI). Data for the kits was derived from Herbert et al. For each replicate and for each kit, the percentage of total miRNAs found in each bin denoted in panel c was calculated. Dot plot shows data for N=3 experiments (ILM, LEX, TRI, AQ) or N=4 (NEB, PEB, QIA), with bars denoting mean ± SD. (e) Among AQRNA-seq and the other RNA-seq kits, a positive correlation exists between the average number of sequence variants detected and the percentage of miRNAs quantified within 2-fold of expected value. 
Sequence variants are defined as additions and subtractions to the insert sequences during library preparation; see Supplementary Figure 2 for the set of sequence variants arising for the kits. Data represent mean ± SD for N=3 experiments (ILM, LEX, TRI, AQ) or N=4 (NEB, PEB, QIA). (f) Correlation of tRNA quantification results using AQRNA-seq versus data derived from 2D gel electrophoresis and northern blotting by Dong et al. Data represent individual values derived from Dong et al. plotted relative to mean values from N=3 AQRNA-seq analyses of E. coli tRNAs.
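The fidelity measure in panels c and d (the share of miRNAs whose read counts fall within 2-fold or 10-fold of the expected value) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the reference pool is assumed to be equimolar, so the expected count for each species is the total read count divided by the number of species, and the miRNA names and counts are invented.

```python
def fold_deviation(observed, expected):
    """Return the fold difference (>= 1) between observed and expected counts.

    A species with zero observed reads is treated as infinitely far from
    expectation (an assumption made for this sketch).
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed <= 0:
        return float("inf")
    ratio = observed / expected
    return ratio if ratio >= 1 else 1 / ratio


def fraction_within(observed_counts, fold):
    """Fraction of species within `fold` of the equimolar expectation."""
    total = sum(observed_counts.values())
    expected = total / len(observed_counts)  # equimolar assumption
    hits = sum(
        1 for count in observed_counts.values()
        if fold_deviation(count, expected) <= fold
    )
    return hits / len(observed_counts)


# Hypothetical read counts for four species (expected = 112.5 reads each).
counts = {"miR-a": 100, "miR-b": 45, "miR-c": 300, "miR-d": 5}
within_2 = fraction_within(counts, 2)
within_10 = fraction_within(counts, 10)
```

In this toy data only `miR-a` lies within 2-fold of expectation, and `miR-d` lies outside 10-fold. The five bins used in the figure itself are defined in the graph and are not reproduced here.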
Figure 3. Alignment plots of M. bovis BCG tRNAs.
(a) Alignment plot showing the start and end position of reads aligning to tRNA Gly-CCC in a stacked horizontal bar graph. The tRNA sequence numbering allows positions 1 and 76 to reflect the 5’ and 3’ termini, respectively. * Anticodon indicated by three vertical lines. (b) Schematic showing linker 1 attachment and reverse transcription along the tRNA sequence (Sprinzl coordinate system). (c) Aligned reads fall into three categories with different interpretations – see text. (d) Example alignment plot for tRNA Tyr-GTA showing polymerase blockage downstream of the anticodon resulting in a lack of Type 3 (full-length) reads. (e) Example of an alignment plot to tRNA Glu-TTC showing polymerase blockage near the anticodon and enrichment of fragments aligning inside the 3’-end, resulting in increased Type 1 reads. BCG AQRNA-seq data available in BioProject #PRJNA579244.
Figure 4. Starvation-induced changes in tRNA abundance correlate with changes in codon-biased translation in M. bovis BCG.
(a) Plots show normalized abundance of selected tRNAs across the starvation time course (S0, nutrient-rich medium; S4–20, 4–20 d starvation; R6, 6 d resuscitation in nutrient-rich medium). Inset: Normalized abundance of tRNA-Thr isoacceptors. Data represent mean ± SD for N=3 experiments. Individual data omitted for clarity. (b) Upper: Time courses for changes in abundance of tRNA-Thr isoacceptors with anticodons CGU and GGU. Dot plot shows data for N=3 experiments with bars for mean ± SD. Lower: Codon usage in mRNAs for the 25 most upregulated proteins at 30 d starvation. ACG and ACC are cognate codons for tRNA-Thr isoacceptors with anticodons CGU and GGU, respectively, noted in the upper panel. Dot plot with bars for mean ± SD shows Z-scores for codon usage relative to genome averages for the mRNAs for the 25 most upregulated proteins. (c) Plots showing abundances of individual isoacceptors (all reads aligning at the 3’-end relative to the total set of tRNAs that carry the same amino acid). Data represent mean ± SD for N=3 experiments. BCG AQRNA-seq data available at BioProject #PRJNA579244.
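The isoacceptor abundances in panel c are expressed relative to the total set of tRNAs carrying the same amino acid. A minimal sketch of that normalization is shown below. The function name `isoacceptor_fractions`, the input layout, and the read counts are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def isoacceptor_fractions(read_counts):
    """Normalize each isoacceptor to the total for its amino acid.

    `read_counts` maps (amino_acid, anticodon) to a read count
    (hypothetical input layout).
    """
    totals = defaultdict(float)
    for (amino_acid, _anticodon), reads in read_counts.items():
        totals[amino_acid] += reads

    fractions = {}
    for (amino_acid, anticodon), reads in read_counts.items():
        total = totals[amino_acid]
        # Guard against amino acids with no reads at all.
        fractions[(amino_acid, anticodon)] = reads / total if total else 0.0
    return fractions


# Invented counts; only the anticodons CGU and GGU are named in the caption.
counts = {
    ("Thr", "CGU"): 300,
    ("Thr", "GGU"): 600,
    ("Thr", "UGU"): 100,
    ("Gly", "CCC"): 50,
}
fractions = isoacceptor_fractions(counts)
```

With these numbers, tRNA-Thr(CGU) accounts for 0.3 of all tRNA-Thr reads, so a shift in that fraction over a time course reflects a change in isoacceptor balance rather than in total tRNA-Thr.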
Figure 5. Application of AQRNA-seq for quantitative mapping of the tRNA epitranscriptome.
Many RNA modifications block 3’-to-5’ reverse transcriptase-mediated cDNA synthesis or result in mutations in AQRNA-seq (gray line and arrow in panel b). This can be exploited to quantitatively map the modifications. (a) Subsets of E. coli tRNAs exhibit similar reverse transcriptase blockages at positions 38 and 48. The heat map shows the percentage of sequencing reads for which reverse transcription ended at a specific location in the tRNA sequence (columns) for each tRNA isoacceptor (rows). The two positions showing the most significant accumulations of polymerase blockade are noted with a small orange or red box. The RNA sequences surrounding these positions are shown in the larger boxes that magnify the sequence location, with specific modified nucleotides noted based on existing maps of E. coli tRNA modifications. In the orange boxes, the 8 tRNA species showing polymerase blockade at position 38 all possess i6A at position 37. In the red boxes, the 10 tRNAs showing polymerase blockade at position 48 all possess acp3U at position 47. (b) In the absence of AlkB treatment, cDNA synthesis is blocked by m1A at position 58 in nearly half of all BCG tRNAs, which is reflected by the high proportion of aligned reads that do not extend past position 58 in the heat map of read start positions (light blue to orange line in heat map similar to panel a). This is illustrated for tRNA Glu-CTC in the gray stack plot, which shows that nearly all the reads begin after position 58, forming a “cliff”. (c) After AlkB demethylation, however, the read alignments lengthen and extend past the cliff, resulting in a more varied distribution of alignment start positions. The heat map shows a significant reduction in the number of aligned reads. (d) Many RNA modifications can also be mapped by polymerase-induced mutations in the resulting cDNA. 
This is illustrated with a striking T-to-C misreading in BCG tRNA Arg-ACG, which is consistent with the presence of inosine at position 34 of the anticodon on nearly all copies of the tRNA. BCG AQRNA-seq data are available in BioProject accession number PRJNA579244.
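One row of the heat maps in panels a and b (the percentage of reads for which reverse transcription ended at each tRNA position) can be approximated with the sketch below. It assumes that the 5’-most aligned position of a read marks where cDNA synthesis stopped, that positions are 1-based, and that the tRNA is 76 nt long. The function name `stop_percentages` and the toy data are hypothetical.

```python
from collections import Counter


def stop_percentages(read_start_positions, length=76):
    """Percentage of reads whose alignment starts at each position.

    `read_start_positions` holds 1-based start positions, one per read.
    Returns a list in which index i corresponds to position i + 1.
    """
    for pos in read_start_positions:
        if not 1 <= pos <= length:
            raise ValueError(f"position {pos} outside 1..{length}")

    total = len(read_start_positions)
    if total == 0:
        return [0.0] * length

    counts = Counter(read_start_positions)
    return [100.0 * counts.get(pos, 0) / total for pos in range(1, length + 1)]


# Toy data: 8 reads start at position 59 (just 3' of a block at 58),
# 2 reads are full length and start at position 1.
starts = [59] * 8 + [1] * 2
row = stop_percentages(starts)
```

In this toy example 80% of reads start at position 59, which would appear as the "cliff" described for untreated samples. After demethylation one would expect the start positions to spread toward the 5’ end.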
Figure 6. AQRNA-seq analysis of miRNAs in HMEC cancer cells.
(a) The HMEC model for progressive tumorigenesis. Primary HMEC cells were immortalized with SV40 large-T antigen and the telomerase catalytic subunit (HMEC1), with subsequent tumorigenic behavior induced by H-Ras oncoprotein (HMEC2) and additional loss of P53 (HMEC3). (b) AQRNA-seq reveals a 5-order-of-magnitude range in levels of 875 miRNAs in the HMEC cell lines. Data represent the averaged read count across 5 different cell cultures for each miRNA. Error bars omitted for clarity. The X-axis order of presentation of individual miRNAs is prioritized by decreasing frequency for HMEC1. (c) PLSR analysis of the abundance of 875 miRNAs associated with the HMEC cells. Left: scores plot showing strong distinctions among the cell lines. Right: loadings plot showing the miRNAs most significantly distinguishing the three HMEC cell lines. (d) Changes in the levels of 14 miRNAs strongly associated with the HMEC cell lines from the analysis in panel c. miRNAs increasing from HMEC1 to HMEC3: 15a-5p, 19a-3p, 4454. miRNAs decreasing from HMEC1 to HMEC3: 24-3p, 4488, 21-5p, 27a-3p. miRNAs on the right were unchanged across the HMEC cell lines. Box-and-whisker plot for N=5 experiments: whiskers, maximal and minimal data; box, 25th to 75th percentile; dash, median; and “+”, mean.

