Nucleic Acids Res. 2025 Jan 24;53(3):gkaf030.
doi: 10.1093/nar/gkaf030.

A high-resolution view of RNA endonuclease cleavage in Bacillus subtilis

James C Taggart et al.

Abstract

RNA endonucleases are the rate-limiting initiators of decay for many bacterial mRNAs. However, the positions of cleavage and their sequence determinants remain elusive even for the well-studied Bacillus subtilis. Here we present two complementary approaches (transcriptome-wide mapping of endoribonucleolytic activity and deep mutational scanning of RNA cleavage sites) that reveal distinct rules governing the specificity among B. subtilis endoribonucleases. Detection of RNA terminal nucleotides in both 5'- and 3'-exonuclease-deficient cells revealed >10^3 putative endonucleolytic cleavage sites with single-nucleotide resolution. We found a surprisingly weak consensus for RNase Y targets, a contrastingly strong primary sequence motif for EndoA targets, and long-range intramolecular secondary structures for RNase III targets. Deep mutational analysis of RNase Y cleavage sites showed that the specificity is governed by many disjointed sequence features. Our results highlight the delocalized nature of mRNA stability determinants and provide a strategy for elucidating endoribonuclease specificity in vivo.

Conflict of interest statement

None declared.

Figures

Graphical Abstract
Figure 1.
Workflow and validation of endoribonuclease cleavage mapping approach. (A) Deletion of exoribonucleases results in stable accumulation of RNA decay intermediates. A schematic mRNA is shown with endoribonuclease cleavage (scissors) occurring, with remaining exoribonucleases indicated with Pacman symbols. RNAs are shown with 5′-P and 3′-OH characteristic of the most common endoribonucleases in B. subtilis. Newly generated RNA ends can be mapped through RNA ligation and sequencing. Adapters used to capture RNA ends are indicated in purple for 3′-end sequencing and yellow for 5′-end sequencing, with matching color scheme in schematic data below. (B) Validation of endonuclease cleavage site detection using known positions of endoribonucleolytic cleavage by RNase III, RNase Y, and Rae1 [19,21,49]. Yellow indicates 5′-end sequencing and purple indicates 3′-end sequencing. Dotted vertical line represents manually annotated cleavage positions. Plotted are reads per million CDS-mapping reads, normalized to the average 3′-mapped Rend-seq RPM in this window.
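The normalization described in this legend (reads per million CDS-mapping reads, then division by the average 3′-mapped Rend-seq RPM in the plotted window) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names and the list-of-counts input layout are assumptions made for the example.

```python
def reads_per_million(counts, total_cds_reads):
    """Scale raw per-position read counts to reads per million CDS-mapping reads."""
    if total_cds_reads <= 0:
        raise ValueError("total_cds_reads must be positive")
    return [c * 1_000_000 / total_cds_reads for c in counts]


def normalize_to_rendseq(end_seq_rpm, rendseq_3p_rpm):
    """Divide end-sequencing RPM by the mean 3'-mapped Rend-seq RPM of the same window.

    Both inputs are per-position lists covering one plotted window (assumed layout).
    """
    mean_rend = sum(rendseq_3p_rpm) / len(rendseq_3p_rpm)
    if mean_rend == 0:
        raise ValueError("no 3'-mapped Rend-seq signal in this window")
    return [v / mean_rend for v in end_seq_rpm]


# Hypothetical window: three positions of 5'-end counts, two million CDS reads.
rpm = reads_per_million([5, 0, 10], 2_000_000)
scaled = normalize_to_rendseq(rpm, [1.0, 4.0])
```

Dividing by the local Rend-seq signal expresses end-sequencing peaks relative to the abundance of the surrounding transcript, so windows from highly and lowly expressed genes can be compared on one axis.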
Figure 2.
Cleavage within transcripts known to be destabilized by endoribonucleolytic cleavage. 5′- and 3′-end sequencing data are shown in yellow and purple, respectively. Plotted are reads per million CDS-mapping reads. Manually annotated putative cleavage sites (positions of adjacent 3′ and 5′ read density) are highlighted with arrows. Insets show highlighted sites at single-nucleotide resolution, with the Y-axis in each direction rescaled to the maximal value within the inset region. Annotated promoters and transcriptional terminators, indicated by bent arrows and lollipops, were identified using Rend-seq in SSB1002. Sequence context and RNase Y dependence of highlighted sites shown in Supplementary Fig. S2. Consistent with a recent report [50], multiple 5′- and 3′-end signals were observed in the 5′ UTR of rny.
Figure 3.
B. subtilis RNase III cleaves within long-range intramolecular secondary structures. (A) Identification of RNase III cleavage positions in the priA mRNA. 5′- and 3′-end sequencing data are shown in yellow and purple, respectively. Identified positions of cleavage are highlighted with a dotted vertical line. Plotted are reads per million CDS-mapping reads, normalized to the average 3′-mapped Rend-seq RPM in this window. (B) Results of systematic identification of RNase III sites. Histogram shows distribution of endonuclease sensitivities for called peak pairs in 3′/5′-end sequencing of exoribonuclease knockouts (see Supplementary Fig. S3 and Materials and methods), with black dashed line indicating threshold for calling dependence on RNase III. 73 sites exceeded this defined threshold. For 46 sites we were unable to calculate a sensitivity score due to an absence of 5′-end sequencing counts in our knockout. These sites are called as RNase III sensitive and are counted within the “>4” bin of the histogram. (C) Predicted secondary structure of the sequence surrounding the identified RNase III sites within the priA mRNA. Positions of cleavage are indicated with arrows, with numbers corresponding to those in (A). Structural distance between these cleavage positions, defined as the length of the 3′ overhang generated by cleavage, is annotated in red. Genomic distance, defined as the distance between the cleavage sites in the primary sequence, is annotated in blue. (D) Genomic distance (as defined in C) between pairs of unambiguous RNase III cleavage positions identified within 1 kb of one another and predicted to fall on opposite sides of an RNA stem with a 2-bp spacing between cleavage positions (34 of 73 positions). (E) Structural distance (as defined in C) generated by RNase III cleavage at both positions within the pairs of cleavage positions plotted in (D).
(F) Predicted secondary structures of the identified RNase III sites within the mRNAs encoding the yqeU, atpA, and asnB genes. Insets are plotted as in (A). (G) Consensus sequence-structure motif of RNase III targets in B. subtilis. The predicted structures for each pair of RNase III-sensitive cleavage sites that generate a 2-nt 3′ overhang were aligned to the positions of cleavage, “1” and “2”. The position-wise frequency of each type of base-pairing interaction is shown. Note that these frequencies sum to the fraction of sites paired at each position. The most frequent base pair at each position is shown in the schematic on the left, with lower-case characters used to designate positions for which the most frequent base pair occurs < 20% of the time. When two base pair frequencies are within 0.02 of one another, both are separated by a slash (e.g. A-U/G-C). N-N indicates 3+ base pairs are most frequent. The grey regions labeled “PB,” “MB,” and “DB” correspond to the proximal, middle, and distal boxes [59,60]. (H) Comparison of base pairing frequencies between B. subtilis and E. coli [26]. The consensus E. coli sequence-structure motif is shown on the left with top base pairs at each position which are shared between B. subtilis and E. coli highlighted in green. Red box highlights the decrease in C-G or G-C pairing in B. subtilis relative to E. coli.
Figure 4.
Revision of EndoA cleavage specificity and evidence for new ribonuclease activity in B. subtilis. (A and B) 3′ (A) and 5′ (B) mapped Rend-seq signal across all UACAU motifs in the genome, separated by downstream nucleotide. 5′-end sequencing data are derived from an RNase J1 deletion (CCB434) and 3′-end sequencing from a 4-exo knockout (CCB396). Rend-seq data at each site are normalized to first (for 3′-mapped) or last (for 5′-mapped) 8 positions within the window and a position-wise mean and standard deviation (shaded interval) are calculated with 90% winsorization. Motif instances with fewer than 1 read per position or 10 reads within the normalization window are not considered. 264 to 435 (A) or 292 to 455 (B) sites were included per motif. The dashed vertical line indicates the position of cleavage. (C and D) 5′-mapped Rend-seq signal in an RNase J1 depletion (CCB390) (C) or an RNase J1 depletion with deletion of ndoA (BJT231) (D). Analysis performed as in (A and B). 191 to 288 (C) or 86 to 117 (D) sites were considered per motif. (E) Rend-seq (top) and end sequencing (bottom) data at a representative EndoA cleavage site located in the gene pgk for a 4-exo knockout (CCB396), RNase J1 depletion (CCB390), and RNase J1 depletion with an ndoA knockout (BJT231). 5′-mapped data shown in yellow (5′-end sequencing) or orange (Rend-seq) and 3′-mapped data shown in purple (3′-end sequencing) or blue (Rend-seq). A dashed vertical line indicates the position of cleavage. Plotted are reads per million CDS-mapping reads. 5′/3′-end sequencing data are normalized to the average 3′-mapped Rend-seq signal in this window. (F) Rend-seq (top) and end sequencing (bottom) data at the glmS ribozyme cleavage. 5′-mapped data shown in yellow (5′-end sequencing) or orange (Rend-seq) and 3′-mapped data in purple (3′-end sequencing) or blue (Rend-seq). Strains included are identical to panel E. A dashed vertical line indicates the position of cleavage. Data are plotted as in (E).
(G) 5′-end sequencing data at the position of glmS ribozyme cleavage for knockouts of additional RNA decay-associated proteins. All knockouts are coupled to either a depletion or deletion of RNase J1. Knocked out genes include rnjA (RNase J1, strain CCB434), rny (RNase Y, strain CCB760), ylbF (YlbF, strain BJT074), rppH (RppH, strain BJT129), and rnc (RNase III, strain BG879). Data are plotted as in (E), with the exception of rppH, which is normalized to an rnjA knockout alone, rather than the same genetic background, due to a lack of corresponding Rend-seq data. (H) Rend-seq data at the position of glmS ribozyme cleavage in a 4-exo knockout (CCB396), RNase J1 knockout (CCB434), and RNase J1 + J2 deficient strain (GLB186). A dashed vertical line indicates the position of cleavage. 3′-mapped data are shown in blue and 5′-mapped data are shown in orange. Plotted are reads per million CDS-mapping reads. (I and J) 5′-end sequencing (I) and Rend-seq 3′-mapped (J) signal across all UACAUA EndoA cleavage motifs, separated by downstream sequence. 5′-end sequencing data derived from an RNase J1 depletion with rny knockout (CCB760) and 3′-end sequencing derived from a 4-exo knockout (CCB396). The signal in a 20-nt window around each cleavage site was normalized to its maximal value and a 90% winsorized position-wise average was calculated across all normalized windows. UACAUA instances with corresponding local Rend-seq density less than 1 read per position were discarded. 10–54 sites (I) or 10–62 sites (J) were considered per row. The dashed vertical line indicates the position of cleavage by EndoA.
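Two steps from this legend lend themselves to a short sketch: locating every UACAU motif together with its downstream nucleotide, and taking a winsorized mean across sites. The code below is an illustration under stated assumptions, not the authors' implementation. In particular, it assumes that "90% winsorization" means clamping the lowest and highest 5% of values to the nearest retained value; the function names and the `tail_percent` parameter are hypothetical.

```python
def find_motif_sites(seq, motif="UACAU"):
    """Return (start, downstream_nt) for each motif occurrence that has a downstream base.

    Overlapping occurrences are reported; `start` is a 0-based index into `seq`.
    """
    sites = []
    start = seq.find(motif)
    while start != -1:
        nxt = start + len(motif)
        if nxt < len(seq):
            sites.append((start, seq[nxt]))
        start = seq.find(motif, start + 1)
    return sites


def winsorized_mean(values, tail_percent=5):
    """Mean after clamping the lowest and highest `tail_percent` of values.

    tail_percent=5 keeps the central 90% untouched (assumed meaning of
    "90% winsorization"); extreme values are replaced, not discarded.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = n * tail_percent // 100          # number of values clamped in each tail
    low, high = ordered[k], ordered[n - 1 - k]
    clamped = [min(max(v, low), high) for v in values]
    return sum(clamped) / n


# Hypothetical toy transcript with two motif instances.
sites = find_motif_sites("GGUACAUAGGUACAUC")
```

Applied position by position across aligned windows, `winsorized_mean` limits the influence of a few very highly expressed sites on the averaged profile, which is the usual reason for winsorizing this kind of metagene signal.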
Figure 5.
Sequence and structural features of putative RNase Y substrates. (A) Identification of putative RNase Y cleavage sites. Histogram shows endonuclease sensitivities for called peak pairs in 3′/5′-end sequencing of exoribonuclease knockouts (see Supplementary Fig. S3 and Materials and methods). Solid red line shows fit of two summed Gaussian distributions to log-transformed data; dashed red lines represent component distributions. The dashed vertical line indicates the lowest endonuclease sensitivity value (10.69) for which 95% of the Gaussian sum is derived from the right constituent distribution. Above this value sites are considered dependent on RNase Y. 669 of 1428 considered putative cleavage signatures exceed this threshold. Ninety-nine sites lacked 5′-end sequencing counts in the knockout needed to calculate a sensitivity score, and were instead deemed RNase Y sensitive and grouped within the “>4” bin. (B) Local sequence context around positions of RNase Y cleavage (N = 669). Nucleotide frequency (top), information content (middle) and k-mer (bottom) logos in a 30 nt window around called RNase Y sites. In k-mer logo, the most enriched and depleted k-mers are shown above and below, respectively. Significant positions (Bonferroni corrected P < 0.01, one-sided binomial test) are highlighted in red. (C) Local GC content around positions of RNase Y cleavage (N = 500). Sites near transcript boundaries are omitted (see Materials and methods). A dashed horizontal line indicates average %GC across B. subtilis genome and a dashed vertical line indicates position of cleavage. Yellow region indicates positions shown in (B). (D) Predicted folding near RNase Y cleavage positions (N = 500). Sites near transcript boundaries are omitted as for (C). Blue line shows position-wise average MFE predicted by RNAfold for 40-mers tiling cleavage sites.
Grey band indicates per-position interquartile range MFEs from folding randomly sampled GA-dinucleotide centered windows within coding regions, excluding putative cleavage positions (see Materials and methods). A dashed vertical line indicates putative position of cleavage. (E) Identification of YlbF-dependent endoribonuclease cleavage sites. Histogram shows endonuclease sensitivities for called peak pairs in stabilized end-sequencing (see Supplementary Fig. S3 and Materials and methods). The dashed vertical line denotes threshold used to call RNase Y sites. Arrows indicate endonuclease sensitivity values of canonical processing sites within the cggR, glnR, and atpI transcripts. Nine sites lacked 5′-end sequencing counts in the knockout needed to calculate a sensitivity score and were instead deemed YlbF sensitive and grouped in the “>4” bin. (F-K) End mapping data showing evidence for inefficient but detectable endoribonucleolytic cleavage at canonical RNase Y processing sites within the transcripts encoding glnR (F and G), cggR (H and I), and atpI (J and K) in the absence of YlbF. (F, H, and J) show 5′/3′-end sequencing and (G, I, and K) show Rend-seq. 5′-mapped data shown in yellow (5′-end sequencing) or orange (Rend-seq) and 3′-mapped data shown in purple (3′-end sequencing) or blue (Rend-seq). A dashed vertical line indicates positions of endoribonucleolytic cleavage. Plotted are reads per million CDS-mapping reads. 5′/3′-end sequencing data are normalized to the average 3′-mapped Rend-seq signal in this window.
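The thresholding rule in panel (A), the lowest sensitivity value at which 95% of the summed Gaussian density comes from the right-hand component, can be sketched as follows. This is a hedged illustration only: the mixture parameters below are invented for the example, the fit itself is not shown, and the scan works in whatever scale the fit was performed in (the legend states the fit was made to log-transformed data). It is not expected to reproduce the reported value of 10.69.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def right_component_fraction(x, w_left, mu_left, sd_left, w_right, mu_right, sd_right):
    """Fraction of the two-Gaussian sum at x contributed by the right component."""
    left = w_left * gaussian_pdf(x, mu_left, sd_left)
    right = w_right * gaussian_pdf(x, mu_right, sd_right)
    total = left + right
    return right / total if total > 0 else 0.0


def sensitivity_threshold(params, step=0.001, min_fraction=0.95):
    """Scan from the left mean to the right mean; return the first x at which the
    right component supplies at least `min_fraction` of the sum, or None."""
    _, mu_left, _, _, mu_right, _ = params
    n_steps = int((mu_right - mu_left) / step)
    for i in range(n_steps + 1):
        x = mu_left + i * step
        if right_component_fraction(x, *params) >= min_fraction:
            return x
    return None


# Hypothetical fit: (weight, mean, sd) for the left and then the right component.
example_params = (0.5, 0.0, 1.0, 0.5, 4.0, 1.0)
threshold = sensitivity_threshold(example_params)
```

Restricting the scan to the interval between the two means is a design choice for this sketch: with unequal variances the wider component can dominate the far tails, so an unrestricted scan could return a spurious value.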
Figure 6.
Mutational scanning of the cggR-gapA operon cleavage site. (A) Design of MPRA construct. Mutated cleavage region is indicated in grey with yellow indicating mutated positions. Variant barcode indicated in green. Translated regions are indicated with a thick border. Promoters and transcriptional terminators are indicated by bent arrows and lollipops, respectively. (B) Workflow for measurement of mRNA processing. Protocol begins from B. subtilis culture containing a pool of genetically encoded cleavage site variants, and splits into gDNA and RNA barcode quantification protocols. The relative RNA abundance is calculated as the ratio of RNA-derived to gDNA-derived barcode reads. (C) Schematic of the cggR148 construct inserted into the aprE MPRA transcript. cggR-derived sequence is colored grey and mutated positions yellow. Translated regions are indicated with a thick border and the variant barcode with green. (D) Impact of all single-nucleotide mutations on the accumulation of barcoded full-length RNA. Boxplots show variation between barcodes of identical variant sequence. Number of barcodes captured for each mutation is indicated above plot. Whiskers indicate 5th and 95th percentile. Grey shaded region indicates interquartile range for variants of wild-type sequence. Dashed vertical line indicates position of cleavage. Blue arrows highlight mutations that close the bulge in the cggR downstream hairpin (illustrated in F). Four variants (≤1 per mutation) have a value of zero and are thus not visualized. (E) Impact of all single-nucleotide mutations on the accumulation of barcoded cleavage product. Boxplots show variation between barcodes of identical variant sequence. Number of barcodes captured for each mutation (N) is indicated by the bar height (top). Whiskers indicate 5th and 95th percentile. Grey shaded region indicates interquartile range for variants of wild-type sequence.
Dashed vertical line indicates position of cleavage and brown bars indicate regions predicted to pair in the formation of a downstream stem-loop structure (illustrated in F). Red arrows indicate mutations predicted to generate G•U wobble pairing, and the blue arrow highlights a mutation which corrects the bulge in this structure. Four variants (≤1 per mutation) have a value of zero and are thus not visualized. (F) Predicted stem-loop structure at 5′-end of processed RNA. Inset shows impact of double mutants on accumulation of processed RNA. Double mutant averages calculated over any variants with the two indicated mutations. Any variants which contain additional mutations that on their own result in greater than 10% change in processed RNA accumulation are discarded. (G) Relationship between predicted strength of downstream secondary structure and accumulation of barcoded cleavage product. The experiment each data point is derived from is denoted with a triangle (experiment 2b, Supplementary Table S7), circle (4a), or square (6). The vertical dashed line indicates the ΔG of the unmutated sequence and horizontal dashed line indicates the median abundance of unmutated variants. Five variants fall above the bounds of this plot (Supplementary Table S8). Blue arrows indicate bulge-closing mutation highlighted in (E). (H) Relationship between predicted strength of downstream secondary structure and accumulation of barcoded full-length RNA. All data derived from experiment 2a (Supplementary Table S7). The vertical dashed line indicates the ΔG of the unmutated sequence and horizontal line indicates the median abundance of unmutated variants. Five variants fall above the bounds of this plot (Supplementary Table S9). Blue arrows indicate bulge-closing mutations highlighted in (D).
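The quantity defined in panel (B), relative RNA abundance as the ratio of RNA-derived to gDNA-derived barcode reads, can be illustrated with a small sketch. Everything beyond that ratio is an assumption made for the example: the dictionary inputs, the `min_gdna` cutoff for dropping barcodes without gDNA coverage, and the use of a per-variant median as the summary are all hypothetical choices, not the authors' documented procedure.

```python
from collections import defaultdict
from statistics import median


def relative_rna_abundance(rna_counts, gdna_counts, min_gdna=1):
    """Per-barcode ratio of RNA-derived to gDNA-derived read counts.

    Barcodes with fewer than `min_gdna` gDNA reads are skipped (assumed filter),
    since the ratio is undefined or unstable without gDNA coverage.
    """
    ratios = {}
    for barcode, gdna in gdna_counts.items():
        if gdna < min_gdna:
            continue
        ratios[barcode] = rna_counts.get(barcode, 0) / gdna
    return ratios


def median_by_variant(ratios, barcode_to_variant):
    """Summarize barcodes that share a variant sequence by their median ratio."""
    grouped = defaultdict(list)
    for barcode, ratio in ratios.items():
        grouped[barcode_to_variant[barcode]].append(ratio)
    return {variant: median(vals) for variant, vals in grouped.items()}


# Hypothetical read counts keyed by barcode sequence.
rna = {"AAA": 50, "CCC": 10, "GGG": 30}
gdna = {"AAA": 100, "CCC": 100, "GGG": 60, "TTT": 0}
variant_of = {"AAA": "wt", "GGG": "wt", "CCC": "mut", "TTT": "mut"}

ratios = relative_rna_abundance(rna, gdna)
summary = median_by_variant(ratios, variant_of)
```

Dividing by gDNA reads corrects for how often each barcode is present in the pooled library, so a variant with few cells is not mistaken for one with unstable RNA.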
Figure 7.
Determinants of endoribonuclease cleavage identified in this work. (A) Summary of the identified cleavage determinants of the three endoribonucleases profiled in this work. RNase III cleaves within double-stranded RNA sequences, including many duplexes formed through long range (>100 nt) intramolecular interactions, leaving a 2-nt 3′ overhang (left, ‘III’). EndoA cleaves specifically at a UACAUA primary sequence motif (right, ‘EndoA’). RNase Y cleavage appears to target AU-rich regions of low secondary structure, preferring cleavage within a G/A dinucleotide (center, ‘Y’). Interrogation of the cleavage sites within cggR and glnR suggests that a -1G is dispensable for RNase Y cleavage, and for a given site, the sequence and structural elements that drive cleavage ("distributed sequence elements") may be distributed over many tens of nucleotides. Further, for some mRNA processing sites such as those in cggR or gapA, a downstream stem-loop structure ("stabilizing hairpin") appears to primarily drive stabilization of the cleaved isoform.
