Mol Cell. 2016 Oct 20;64(2):294-306.
doi: 10.1016/j.molcel.2016.08.035. Epub 2016 Oct 6.

RNA Sequence Context Effects Measured In Vitro Predict In Vivo Protein Binding and Regulation

J Matthew Taliaferro et al. Mol Cell.

Abstract

Many RNA binding proteins (RBPs) bind specific RNA sequence motifs, but only a small fraction (∼15%-40%) of RBP motif occurrences are occupied in vivo. To determine which contextual features discriminate between bound and unbound motifs, we performed an in vitro binding assay using 12,000 mouse RNA sequences with the RBPs MBNL1 and RBFOX2. Surprisingly, the strength of binding to motif occurrences in vitro was significantly correlated with in vivo binding, developmental regulation, and evolutionary age of alternative splicing. Multiple lines of evidence indicate that the primary context effect that affects binding in vitro and in vivo is RNA secondary structure. Large-scale combinatorial mutagenesis of unfavorable sequence contexts revealed a consistent pattern whereby mutations that increased motif accessibility improved protein binding and regulatory activity. Our results indicate widespread inhibition of motif binding by local RNA secondary structure and suggest that mutations that alter sequence context commonly affect RBP binding and regulation.

Keywords: RBNS; RNA processing; post-transcriptional.

Figures

Figure 1. RBP/RNA interaction measured in vitro is influenced by both RNA motif content and contextual features
A) The fraction of RBFOX2 RNA motifs (UGCAUG) identified as occupied in vivo by RBFOX2 using eCLIP, using genes binned by expression level (x-axis). The dotted line represents the observed fraction of RBFOX2 motifs that were bound in expressed introns and 3' UTRs. This fraction was then corrected to take into account the estimated sensitivity of the eCLIP assay to create a maximum expected fraction as described in supplementary Methods. Lines and shaded areas represent mean and standard deviation, respectively. B) Phylogenetic tree shows the relative evolutionary age of mouse, rat, cow, and macaque. Cassette exons (blue) were classified according to their evolutionary age of alternative splicing (Merkin et al., 2012) – see Supplemental Methods. C) Considering intronic regions downstream of cassette exons in each evolutionary age group, the percent of introns that show significant interaction with RBFOX2 in vivo in mESCs are shown. Tandem bars show introns without (−, left) or with (+, right) an RBFOX2 motif. D) Experimental design of natural sequence RBNS experiment. E) As in C, except that the y-axis signifies the fraction of oligos that were bound by RBFOX2 in vitro. F) Intronic regions corresponding to exons of younger evolutionary ages were subsampled to match the RBFOX2 motif counts in the mammalian-wide set. The average fraction of introns with RBFOX2 CLIP peaks in 50 independent subsamples is shown. Error bars show standard deviation. G) As in F, but shows MBNL1 binding of oligos subsampled to match the MBNL motif count in the mammalian-wide set. H) As in F, but showing MSI1 binding of oligos subsampled to match the MSI1 motif number in the mammalian-wide set. I-K) The cumulative distribution function of RBNS R scores is shown for intronic regions flanking constitutive (gray) and mammalian (red) exons for I) RBFOX2, J) MBNL1 and K) MSI1 RBNS experiments. Distinct line types correspond to different motif numbers for the indicated RBP. See also Figure S1.
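The count-matched subsampling described in panels F-H can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the tuple layout of `candidates`, and the choice to match motif counts exactly and to sample without replacement are assumptions made for illustration. Only the UGCAUG motif and the use of 50 independent subsamples come from the caption.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, stdev


def count_motif(seq, motif="UGCAUG"):
    """Count occurrences of `motif` in `seq`, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def matched_bound_fraction(reference_counts, candidates, n_subsamples=50, seed=0):
    """Subsample `candidates` so their motif counts match `reference_counts`.

    reference_counts: motif count of each region in the reference set
                      (hypothetical stand-in for the mammalian-wide set).
    candidates:       list of (motif_count, is_bound) tuples for the set
                      being subsampled (assumed layout).
    Returns (mean, standard deviation) of the bound fraction across subsamples.
    """
    rng = random.Random(seed)
    target = Counter(reference_counts)

    # Group candidate regions by their motif count.
    by_count = defaultdict(list)
    for motif_count, is_bound in candidates:
        by_count[motif_count].append(is_bound)

    # Assumption: exact matching requires enough candidates in every bin.
    for motif_count, needed in target.items():
        if len(by_count[motif_count]) < needed:
            raise ValueError(f"not enough candidates with {motif_count} motifs")

    fractions = []
    for _ in range(n_subsamples):
        picked = []
        for motif_count, needed in target.items():
            picked.extend(rng.sample(by_count[motif_count], needed))
        fractions.append(sum(picked) / len(picked))
    return mean(fractions), stdev(fractions)
```

Matching on motif count before comparing bound fractions separates the effect of sequence context from the effect of simply having more motifs, which is the comparison the caption describes.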
Figure 2. Intronic sequences flanking exons regulated during in vivo differentiation are more often bound in vitro
A) Expression levels of Mbnl and Rbfox transcripts during the induction of mouse embryonic stem cells into glutamatergic neurons. B) Diagram of Monotonicity z-score (MZ score) definition during neuronal differentiation. Exons with negative MZ scores show consistent, monotonic decrease in inclusion while those with positive MZ scores show consistent, monotonic increase. C) Cumulative distributions of absolute monotonicity scores of mammalian, rodent and mouse skipped exons during in vitro neuronal differentiation show mammalian-wide AS exons are more likely to be developmentally regulated. D) The distribution of monotonicity scores during neuronal induction for exons flanked by in vitro bound and unbound intronic RNAs. Because MBNL sites are most active in the intronic sequence immediately upstream of skipped exons, we considered those regions for this analysis. Introns flanking exons that are regulated in vivo were more likely to be bound by MBNL1 in vitro. See also Figure S2.
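The caption gives only the sign convention of the MZ score, not its formula. The following Python sketch shows one plausible way to build a monotonicity z-score with that sign convention. It is an assumption for illustration, not the paper's definition: the statistic (increases minus decreases between consecutive timepoints), the shuffled-timepoint null, and the names `monotonicity` and `mz_score` are all hypothetical.

```python
import random
from statistics import mean, pstdev


def monotonicity(psi):
    """Number of consecutive increases minus consecutive decreases."""
    return sum((b > a) - (b < a) for a, b in zip(psi, psi[1:]))


def mz_score(psi, n_shuffles=1000, seed=0):
    """Z-score of the observed monotonicity against shuffled timepoint orders.

    psi: exon inclusion values ordered by differentiation timepoint.
    Assumed null model: random permutations of the same values.
    """
    rng = random.Random(seed)
    observed = monotonicity(psi)

    values = list(psi)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(values)
        null.append(monotonicity(values))

    sd = pstdev(null)
    if sd == 0:
        # No variation under the null (for example, a constant series).
        return 0.0
    return (observed - mean(null)) / sd
```

Under this sketch, a steadily rising inclusion series gives a positive score and a steadily falling one gives a negative score, matching the convention stated in panel B.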
Figure 3. RBP/RNA interactions measured in vivo are recapitulated in vitro
A) Scatter plot of the RBNS R scores for RBFOX2 motif-containing introns versus RBFOX2 CLIP-seq density in mESCs. Colors correspond to evolutionary age. B) The regulation of alternative exons during neuronal differentiation was measured and plotted as MZ scores. Introns flanking these exons were classified as bound and unbound both in vitro and in vivo. The cumulative distribution of MZ scores for each class of intronic sequence is shown. P-values are from Wilcoxon rank-sum tests between bound and unbound sequences. See also Figure S3.
Figure 4. Reduced basepair probabilities in and around motifs bound by RBPs in vitro
A, B) Basepair probabilities in intronic RNA oligos surrounding mammalian alternative exons are shown for sequences immediately surrounding MBNL1 (A) and RBFOX2 (B) motifs. RNA oligos have been classified based on whether or not they were bound by the RBP in vitro. Lines represent LOESS fits of the basepair probabilities while shaded areas represent the 95% confidence interval of the fit. C) Basepair probabilities were calculated as in B and then averaged across the nucleotides of each motif occurrence. Motifs were separated based on whether the RNA sequence they are contained within was bound in vitro by the indicated RBP (line type) and the evolutionary age and regulation of the neighboring alternative exon (color). See also Figure S4.
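The per-motif averaging described in panel C can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes the per-nucleotide basepair probabilities have already been computed by an RNA folding tool and are supplied as a plain list, and the function name `mean_motif_pairing` is hypothetical.

```python
def mean_motif_pairing(seq, pair_prob, motif):
    """Average basepair probability over each occurrence of `motif`.

    seq:       RNA sequence.
    pair_prob: probability that each nucleotide is paired, one value per
               position of `seq` (assumed to be precomputed elsewhere).
    Returns a list of (start_index, mean_probability), one per occurrence.
    """
    if len(seq) != len(pair_prob):
        raise ValueError("seq and pair_prob must have the same length")

    k = len(motif)
    results = []
    for start in range(len(seq) - k + 1):
        if seq[start:start + k] == motif:
            window = pair_prob[start:start + k]
            results.append((start, sum(window) / k))
    return results


# Toy example with made-up probabilities around one RBFOX2 motif.
example = mean_motif_pairing(
    "AAUGCAUGAA",
    [0.9, 0.9, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.9, 0.9],
    "UGCAUG",
)
```

A low mean for a motif occurrence corresponds to a more accessible, less paired motif, which is the feature the figure associates with binding in vitro.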
Figure 5. Distinct patterns of RNA mutations lead to increased RBP/RNA interaction through secondary structure rearrangements
A) Frequencies of mutations at each position across the randomly mutated Myo1b RNA oligo. The frequencies of mutations in the input, no protein control pulldown, and MBNL1 pulldown RNA pools were calculated for each position in the oligo. The frequencies in each pulldown were compared to the frequencies in the input RNA pool. Adapter and motif regions (shaded) were held constant and thus have mutation frequencies of zero. The MFE structure of the wildtype Myo1b oligo is shown above the sequence. Positions at which the MBNL1 pulldown frequency was greater than the 99th percentile of no protein control frequencies are marked with an asterisk. B) One million randomly selected Myo1b RNA oligos from each of the input, no protein pulldown, and MBNL1 pulldown libraries were computationally folded to assess secondary structure. The differences in mean basepair probabilities at each position between pulldown oligos and input oligos are shown. For each position, the statistical significance (Wilcoxon rank sum) of the basepair probability difference between the pulldown oligos (MBNL1 and no protein) and input oligos was calculated. Positions at which the P value for the MBNL1 pulldown was greater than the 99th percentile of P values for the no protein control across all positions are marked with an asterisk. C) The relative frequencies (MBNL1 pulldown / input) of co-occurring pairs of mutations in the Myo1b oligo are shown in the lower triangle. The most enriched pairs of mutations are shown here. In the upper triangle, the relative frequencies (no protein control pulldown / input) of these co-occurring mutation pairs are shown. D) As in B, one million randomly selected Myo1b RNA oligos from the input and MBNL1 pulldown libraries were computationally folded to assess secondary structure. For each (i, j) pair of positions, the mean probability across those million sequences that bases i and j were paired was calculated. The difference in mean probability between input and MBNL1 pulldown libraries is shown. E) The centroid predicted structure for the wildtype Myo1b oligo is shown (center). Bases that have predicted basepair probabilities of greater than 0.9 or less than 0.1 are outlined in red and blue, respectively. The position of the MBNL motif is indicated with green dots. Introducing the most enriched trio of mutations from the MBNL1 pulldown (blue dots, right) results in a different predicted structure with the MBNL motif less paired. Conversely, introducing the most depleted trio of mutations from the MBNL1 pulldown (orange dots, left) results in a predicted structure with the MBNL motif more paired. See also Figure S5.
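The per-position comparison of pulldown and input mutation frequencies in panel A can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: reads are assumed to be the same length as the wildtype oligo with no insertions or deletions, the function names are hypothetical, and returning `None` where the input frequency is zero (as in the constant adapter and motif regions) is a choice made here.

```python
def mutation_frequencies(reads, wildtype):
    """Fraction of reads that differ from `wildtype` at each position."""
    if not reads:
        raise ValueError("reads must not be empty")
    counts = [0] * len(wildtype)
    for read in reads:
        if len(read) != len(wildtype):
            raise ValueError("reads must match the wildtype length")
        for i, (base, wt_base) in enumerate(zip(read, wildtype)):
            if base != wt_base:
                counts[i] += 1
    return [c / len(reads) for c in counts]


def relative_frequencies(pulldown_reads, input_reads, wildtype):
    """Per-position ratio of pulldown to input mutation frequency.

    Positions never mutated in the input (for example, regions held
    constant by design) have no defined ratio and are reported as None.
    """
    pulldown = mutation_frequencies(pulldown_reads, wildtype)
    background = mutation_frequencies(input_reads, wildtype)
    return [p / b if b > 0 else None for p, b in zip(pulldown, background)]


# Toy example on a 4-nt "oligo" with made-up reads.
ratios = relative_frequencies(
    ["UCGU", "UCGU", "ACGU", "ACGU"],   # pulldown pool
    ["ACGU", "ACGU", "UCGU", "ACGA"],   # input pool
    "ACGU",
)
```

A ratio above 1 marks a position where mutations are more frequent in the pulldown than in the input. The same ratio computed for the no protein control gives the background against which the caption's 99th-percentile threshold is set.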
Figure 6. cmRBNS-enriched mutations enhance regulatory activity
A) Schematic design of splicing reporter construct based on mouse Vldlr alternative exon 16. The intronic Mbnl motif was replaced by an Mbnl motif in specific contexts derived from cmRBNS analysis of the Myo1b oligo (Fig. 5). The original Myo1b context was used along with the Enr3 and Dep3 contexts representing the most significantly enriched and depleted triplets of mutations. B) Semi-quantitative RT-PCR analysis of exon inclusion using the original, Enr3 and Dep3 contexts. MBNL levels were modulated by using Mbnl1/Mbnl2 double knockout MEFs, wildtype MEFs, and Mbnl1 overexpression MEFs. C) sqRT-PCR analysis of 3 biological replicate experiments like those in B, with 3 replicates of each PCR assay. Error bars represent standard deviation. P-values by t-test. See also Figure S6.
Figure 7. The appearance of new motifs and increased accessibility of the original motif are approximately equally enriched in the MBNL1 pulldown
A) Model for acquisition of RBP binding. In principle, RBP binding may arise in two ways in mutated oligos. Mutations could result in the appearance of new MBNL motifs (upper arrow), which may fall in unpaired (top) or paired (below) regions. Mutations may also result in changes to the RNA secondary structure around the original MBNL motif, promoting binding by increasing the accessibility of a pre-existing motif. B) Relative enrichments of the indicated classes of RNA sequences in the MBNL1 pulldown compared to input. Secondary structure change around nonmotif sequences was not enriched (left). However, reduced basepairing (Ppair < 0.5) of the original MBNL motif (yellow) and appearance of new MBNL motifs (blue) were enriched singly and the co-occurrence of both phenomena was strongly enriched (green). C) Appearance of new unpaired motifs (blue) was more strongly enriched than appearance of new motifs that were basepaired (gray). RNA sequences containing both an unpaired original motif and an unpaired new motif (purple) were the most enriched class. Error bars correspond to 95% confidence intervals.
