Mol Cell. 2015 Oct 1;60(1):131-45.
doi: 10.1016/j.molcel.2015.08.015. Epub 2015 Sep 24.

The Menu of Features that Define Primary MicroRNAs and Enable De Novo Design of MicroRNA Genes

Wenwen Fang et al. Mol Cell.

Abstract

MicroRNAs (miRNAs) are small regulatory RNAs processed from stem-loop regions of primary transcripts (pri-miRNAs), with the choice of stem loops for initial processing largely determining what becomes a miRNA. To identify sequence and structural features influencing this choice, we determined cleavage efficiencies of >50,000 variants of three human pri-miRNAs, focusing on the regions intractable to previous high-throughput analyses. Our analyses revealed a mismatched motif in the basal stem region, a preference for maintaining or improving base pairing throughout the remainder of the stem, and a narrow stem-length preference of 35 ± 1 base pairs. Incorporating these features with previously identified features, including three primary-sequence motifs, yielded a unifying model defining mammalian pri-miRNAs in which motifs help orient processing and increase efficiency, with the presence of more motifs compensating for structural defects. This model enables generation of artificial pri-miRNAs, designed de novo, without reference to any natural sequence yet processed more efficiently than natural pri-miRNAs.

Figures

Figure 1. Design of pri-miRNA pools and high-throughput analyses of variants. See also Figure S1.
(A) Secondary structure of the three parental pri-miRNAs. miRNA–miRNA* duplexes are colored red. Blue numbers above 5p sequences indicate randomized windows, most of which contained three nucleotides on the 5p arm and the corresponding nucleotides on the 3p arm. (B) Schematic of the protocol for generating each dictionary of barcoded variants and quantifying the amount of each variant that was in the input and that was cleaved at each site. (C) Distribution of unique barcodes per variant in each dictionary. (D) Distributions of reads per variant in the input sequencing. (E) Distributions of cleavage sites for the sequenced 5′-cleavage fragments. (F) Distributions of cleavage scores.

Figure 2. Structural preferences across the stem. See also Figure S2.
(A) Cleavage scores for all 4096 variants of pri-miR-16 within randomized window 2 (grey shaded). Each row shows the scores of the indicated 5p trinucleotide (written 5′ to 3′), and each column shows the scores of the indicated 3p trinucleotide (written 3′ to 5′), colored according to the key (right). The asterisk marks the wild-type sequence. (B) Base-pairing scores at each position across each of the indicated stems. (C) Detrimental effects of maintaining or strengthening the apical pairing of pri-miR-16. Each 4 × 4 heat map shows the scores of all 16 single-bp variants at the shaded position in the context of wild-type nucleotides at all other positions, colored according to the key (below). Each asterisk marks the wild-type sequence. (D) Beneficial effects of pairing at position 2 and of maintaining the UGU motif in the apical region of pri-miR-30. Otherwise, this panel is as in (C). (E) The modest effects of changing or eliminating each of the bulges normally found in each of the three pri-miRNAs. The heat map shows the cleavage scores of the indicated variants in the context of wild-type nucleotides at all other positions (−, no bulge), colored according to the key (below). Because pri-miR-16 had a single-nucleotide bulge, dinucleotide variants were not tested (grey). Each asterisk marks the wild-type sequence.

Figure 3. A broadly conserved mismatched GHG motif enhances pri-miRNA processing. See also Figure S3 and Tables S2 and S3.
(A) Nucleotide pairs preferred at the three positions of the mismatched GHG motif. Shown is the relative fraction of each nucleotide pair observed in the top 1% of the variants generated from randomizing positions 7–9 of the three pri-miRNAs (Table S2). For each pair the first letter indicates the 5p nucleotide, and the second letter the 3p nucleotide. (B) Primary-sequence preferences within the mismatched GHG motif. Shown is a pLogo, which represents the nucleotide enrichment and depletion observed at the indicated positions within the top 41 variants generated from randomizing positions 7–9 (Table S2), compared to the background of all 4096 possible variants at these positions (O’Shea et al., 2013). Red lines indicate a P-value threshold of 0.05. (C) Enrichment of the mismatched GHG motif in natural miRNAs. The mismatched GHG motif was defined as a 3-bp structural element in which the first pair could be C–G or U–G, the second could be one of the seven mismatches or pairs shown in panel A, and the third could be any Watson–Crick base pair. The heat map shows the frequency of the motif observed at the indicated position within the stems of representative pri-miRNAs from the indicated species (Table S3). The asterisk indicates species with a significant enrichment at position 7 (P < 0.05, one-tailed binomial test with Bonferroni correction). (D) Increased cleavage efficiency imparted by the mismatched GHG motif. The gel (center) shows results of competitive-cleavage assays that determined the relative cleavage of pri-miR-125 variants 1–5, which had the indicated substitutions within the mismatched GHG motif (center left table). The wild-type (WT) hairpin with mismatched GHG motif at positions 7–9 (blue shading) is shown for reference (upper left). As schematized (lower left), each assay included the query variant, which generated a 39 nt labeled product, and a longer pri-miR-125 wild-type reference substrate, which generated a 69 nt labeled product.
The graph (right) shows the mean relative cleavage efficiency of each variant, normalized to that of the wild-type (blue bars; error bars, s.e.m., n = 3), compared to the value determined from the high-throughput sequencing experiment (orange bars). (E) Increased miRNA accumulation imparted by the mismatched GHG motif in HEK293T cells. The mismatched GHG motif was tested in the context of pri-miR-44.3 (bottom left), a derivative of C. elegans pri-miR-44 with a U substitution (blue) that increases processing, presumably because it destabilizes pairing beyond the basal stem and introduces a basal UG motif (Auyeung et al., 2013). The variants (center table) introduced either the mismatched GHG motif (variant 1) or control sequences. RNA blots (top right) examined miR-44 accumulation in cells for each variant when expressed as a query pri-miRNA on the same primary transcript as pri-miR-1, as schematized (top left). The graph (bottom right) plots relative levels of mature miR-44 after normalizing to the miR-1 internal reference (mean ± s.e.m., n = 3; **, P ≤ 0.01, one-tailed Student’s t-test).
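
The enrichment test named in panel (C), a one-tailed binomial test with Bonferroni correction, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the species labels, the counts, and the background frequency are all hypothetical and are not taken from the paper or its supplementary tables.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), i.e. a one-tailed enrichment P value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bonferroni(p_values: list[float]) -> list[float]:
    """Multiply each P value by the number of tests performed, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical inputs, NOT data from the paper:
# label -> (hairpins with the motif at position 7, hairpins examined, assumed background frequency)
counts = {
    "species_a": (30, 100, 0.15),
    "species_b": (16, 100, 0.15),
}

raw = {name: binomial_upper_tail(k, n, p) for name, (k, n, p) in counts.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

# A species would be flagged (asterisk) when its corrected P value falls below 0.05.
significant = {name: p < 0.05 for name, p in adjusted.items()}
print(significant)
```

With these made-up counts, only the first label passes the corrected threshold. How the background frequency was actually estimated is not stated in the caption, so it is left as an input here.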

Figure 4. De novo designed pri-miRNAs are processed efficiently and accurately in vitro and in cells. See also Figure S4.
(A) Guidelines for de novo design of pri-miRNAs. Motif residues are highlighted (blue). PolyU segments, which disfavor pairing, and other constrained sequences, some of which favor loading into Argonaute, are purple (W = A or U), and randomly assigned residues or pairs are grey (N = A, C, G, or U). (B) Sequences of three artificial pri-miRNAs (A1, A2 and A3) and their variants in which the motifs were disrupted (A1.1, A2.1 and A3.1, green substitutions). Motif residues are highlighted (blue), and residues of the miRNA duplex are red. (C) In vitro cleavage efficiencies of artificial pri-miRNAs, comparing variants with and without all motifs and those with and without the mismatched GHG motif. Variants with and without all motifs are shown in (B); A1.2, A2.2, A3.2 each have the mismatched GHG motif as the only motif, and A1.3, A2.3, A3.3 each have all the motifs except the mismatched GHG motif. Plotted in blue are mean cleavage efficiencies at the correct site relative to the pri-miR-125 internal reference, determined as in Figure 3D (error bars, s.e.m., n = 3; ***, P ≤ 0.001; **, P ≤ 0.01; *, P ≤ 0.05; n.s., P > 0.05; one-tailed Student’s t-test). If miscleavage was detected, its efficiency was similarly plotted in grey. See Figure S4A for images of competitive-cleavage results. (D) Accumulation of mature artificial miRNAs in HEK293T cells, comparing variants with and without all motifs and those with and without the mismatched GHG motif. Assays were as in Figure 3E; artificial pri-miRNA variants and evaluation of statistical significance were as in (C). (E) miRNA yield from artificial pri-miRNAs relative to that from natural pri-miRNAs. As schematized (left), each artificial pri-miRNA was transcribed between the pri-miR-30 and pri-miR-1 internal references as the query pri-miRNA. Plotted are the relative levels of mature miRNAs, determined using quantitative RNA blots (mean ± s.e.m., n = 3). See Figure S4B for images of quantitative RNA blots.

Figure 5. Sequence motifs rescue suboptimal stem lengths.
(A) Diagrams of extension and deletion variants of A1 and A2. Otherwise, this panel is as in Figure 4B. (B) In vitro cleavage efficiencies of the extension (40 bp) and deletion (30 bp) variants with or without motifs. Plotted are mean cleavage efficiencies relative to the pri-miR-125 internal reference, determined as in Figure 3D (error bars, s.e.m., n = 2). See Figure S5 for images of competitive-cleavage results. (C) Accumulation of mature miRNAs from the extension and deletion variants, with or without motifs, in HEK293T cells. Assays were as in Figure 3E. Mature miRNA levels relative to co-transcribed miR-1 are indicated below each lane, reporting the mean from two biological replicates. Results of Figure 4E were used to infer the ratio of A1 and A2 accumulation relative to that of miR-1, and the other values were calculated based on this ratio.

Figure 6. Motifs rescue structural defects. See also Figure S6 and Table S4.
(A) The average effects of each pair, wobble or mismatch possibility on cleavage, compared to the frequency of that possibility in natural pri-miRNAs. Cleavage effects were determined from the high-throughput results, averaging the cleavage scores calculated from single-bp variants of pri-miR-125, pri-miR-16, and pri-miR-30 (orange bars, left axis). The frequency of each possibility was tallied across the 35-bp stems of representative members of 186 conserved human pri-miRNA families (Table S4; purple bars, right axis). (B) Diagrams of structural variants of A1 and A2. Motifs are highlighted (blue); miRNA duplexes are red; substituted residues are dark blue, and structure scores are in parentheses. (C) In vitro cleavage efficiencies of structural variants of A1 and A2, with or without motifs. Plotted are mean cleavage efficiencies relative to the pri-miR-125 internal reference, determined as in Figure 3D (error bars, s.e.m., n = 2). See Figure S6B for images of competitive-cleavage results. (D) Accumulation of mature miRNAs from structural variants of A1 and A2, with or without motifs, in HEK293T cells. Assays were as in Figure 3E. Mature miRNA levels relative to co-transcribed miR-1 are indicated below each lane, as in Figure 5C. (E) Diagram of extension variants of A1.12 and A2.8. (F) In vitro cleavage efficiencies of extension variants of A1.12 and A2.8, with or without motifs. Otherwise, this panel is as in (C). See Figure S6D for images of competitive-cleavage results. (G) Accumulation of mature miRNAs from extension variants of A1.12 and A2.8, with or without motifs, in HEK293T cells. Otherwise, this panel is as in (D).

Figure 7. Artificial miRNAs mediate repression. See also Figure S7.
(A) Sequences of artificial pri-miRNAs A4, A5 and A6. Otherwise, this panel is as in Figure 4B. (B) Response of cellular mRNAs upon co-expression of the indicated artificial miRNA and miR-1. Plotted are cumulative distributions of fold changes for mRNAs with the indicated sites in their 3′ UTRs. mRNAs with 3′-UTR sites to both miR-1 and the artificial miRNA were not considered. For each set of mRNAs, the number of reliably quantified distinct mRNAs is shown in parentheses, and for sets containing sites, the P value reports the significance of the difference in the fold-change distribution compared to that of the corresponding set of mRNAs without sites (one-tailed Mann–Whitney test).
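
The comparison described in panel (B), a one-tailed Mann–Whitney test on fold-change distributions, can be sketched with the Python standard library alone. This is a simplified illustration, not the authors' analysis: it uses the normal approximation without tie or continuity corrections, the function name is hypothetical, and the fold-change values are invented for the example.

```python
from math import erf, sqrt


def mann_whitney_less(x: list[float], y: list[float]) -> float:
    """One-tailed P value for the alternative that values in x tend to be smaller than in y.

    Uses the normal approximation to the U statistic, with no tie or
    continuity correction (a simplification suited to an illustration).
    """
    n, m = len(x), len(y)
    # U counts the (x, y) pairs in which x exceeds y; ties count as one half.
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)
    mean = n * m / 2
    sd = sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean) / sd
    # Standard normal CDF evaluated at z: a small U (x mostly below y) gives a small P value.
    return 0.5 * (1 + erf(z / sqrt(2)))


# Invented log2 fold changes, NOT data from the paper.
with_sites = [-0.9, -0.7, -0.6, -0.5, -0.4, -0.3]   # mRNAs with 3'-UTR sites
without_sites = [-0.1, 0.0, 0.1, 0.2, 0.3, 0.4]     # mRNAs without sites

p_value = mann_whitney_less(with_sites, without_sites)
print(f"one-tailed P = {p_value:.4f}")
```

For real data with many mRNAs and tied values, a statistics library with exact or tie-corrected P values would be the safer choice.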
