Nat Methods. 2021 May;18(5):507-519.
doi: 10.1038/s41592-021-01128-0. Epub 2021 May 7.

Robust single-cell discovery of RNA targets of RNA-binding proteins and ribosomes

Kristopher W Brannan et al. Nat Methods. 2021 May.

Abstract

RNA-binding proteins (RBPs) are critical regulators of gene expression and RNA processing that are required for gene function. Yet the dynamics of RBP regulation in single cells is unknown. To address this gap in understanding, we developed STAMP (Surveying Targets by APOBEC-Mediated Profiling), which efficiently detects RBP-RNA interactions. STAMP does not rely on ultraviolet cross-linking or immunoprecipitation and, when coupled with single-cell capture, can identify RBP-specific and cell-type-specific RNA-protein interactions for multiple RBPs and cell types in single, pooled experiments. Pairing STAMP with long-read sequencing yields RBP target sites in an isoform-specific manner. Finally, Ribo-STAMP leverages small ribosomal subunits to measure transcriptome-wide ribosome association in single cells. STAMP enables the study of RBP-RNA interactomes and translational landscapes with unprecedented cellular resolution.

Conflict of interest statement

GWY is co-founder, member of the Board of Directors, on the SAB, equity holder, and paid consultant for Locana and Eclipse BioInnovations. GWY is a visiting professor at the National University of Singapore. GWY’s interests have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. The authors declare no other competing financial interests.

Figures

Extended Data Fig. 1. RBP-STAMP reproducibility and concordance with eCLIP, related to Figure 1.
A) Irreproducible Discovery Rate (IDR) analysis comparing ≥ 0.5 confidence edit windows for increasing levels of RBFOX2-STAMP at 24, 48 and 72 hours. B) Differential expression (DESeq2) analysis of RBFOX2-STAMP for increasing levels of RBFOX2-STAMP at 72 hours. C) Fraction of RBFOX2-APOBEC1 eCLIP peaks overlapping low and high induction RBFOX2-STAMP edit sites at increasing expression (TPM) thresholds. D) STAMP edit-site filtering and cluster-calling workflow. E) Number of control- and RBFOX2-STAMP edit sites and clusters retained after each filtering step in D. F) Cumulative distance measurement from RBFOX2-STAMP distal edit-clusters to eCLIP peaks on target genes. G) Pie chart showing the proportion of N-terminally fused RBFOX2-APOBEC1 STAMP edit-clusters overlapping with either 1) RBFOX2-APOBEC1 N-terminal fusion high-confidence eCLIP peaks (log2fc>2 and -log10p>3 over input) containing the conserved RBFOX2 binding motif (GCAUG), 2) equally stringent eCLIP peaks not containing the conserved motif, 3) the conserved motif falling outside of eCLIP peaks, or 4) neither eCLIP peaks nor conserved motifs. H) Quantification of expression from no dox (0ng/ml), low (50ng/ml) or high (1μg/ml) doxycycline induction of SLBP-APOBEC1 and TIA1-APOBEC1 fusions compared to endogenous expression. I) Irreproducible Discovery Rate (IDR) analysis comparing ≥ 0.5 confidence level edit windows for increasing levels of TIA1-STAMP at 72 hours. J) Fraction of SLBP eCLIP peaks (log2fc>2 and -log10p>3 over size-matched input, reproducible by IDR) with SLBP-STAMP edit-clusters, compared to size-matched shuffled regions, calculated at different edit site confidence levels before and after site filtering (see Materials and Methods for filtering procedure). Numbers atop bars are Z-scores computed comparing observed with the distribution from random shuffles. *** denotes statistical significance at p = 0, one-sided exact permutation test.
K) Fraction of TIA1-APOBEC1 eCLIP peaks (log2fc>2 and -log10p>3 over size-matched input) with TIA1-STAMP edit-clusters, compared to size-matched shuffled regions, calculated at different edit site confidence levels before and after site filtering (see Materials and Methods for filtering procedure). Numbers atop bars are Z-scores computed comparing observed with the distribution from random shuffles. *** denotes statistical significance at p = 0, one-sided exact permutation test. L) Motif enrichment using HOMER and shuffled background on TIA1-STAMP edit-clusters.
Extended Data Fig. 2. Ribo-STAMP reproducibility and response to mTOR pathway perturbations, related to Figure 2.
A) Quantification of expression from no dox (0ng/ml), low (50ng/ml) or high (1μg/ml) doxycycline induction of RPS2-APOBEC1 fusion compared to endogenous expression. B-D) Scatterplot comparisons of CDS+3’UTR EPKM values from RPS2-STAMP replicate experiments showing high, dose-dependent correlation at 24 (B), 48 (C) and 72 hours (D). E) Scatterplot comparison of CDS EPKM values with CDS+3’UTR EPKM values for RPS2-STAMP. F) Pearson R² values for low and high induction control- or RPS2-STAMP EPKM compared to poly-ribosome-enriched polysome-seq RPKM. G) Comparison of EPKM from vehicle treated 72-hour high-induction control-STAMP compared to Torin-1 treated 72-hour high-induction control-STAMP showing no significant signal reduction for top ribosome occupied quartile genes containing Torin-1 sensitive TOP genes as detected by ribo-seq (Q1 p = 1.0, n = 3589 genes, Wilcoxon rank-sum one-sided) and polysome profiling (Q1 p = 1.0, n = 3589 genes, Wilcoxon rank-sum one-sided). H) Scatterplot comparison of CDS+3’UTR EPKM values on ribo-seq top quartile genes (n = 3589) for Torin-1 treated and vehicle treated RPS2-STAMP 72-hour high (1μg/ml) doxycycline inductions as in Figure 2H. I) Scatterplot comparison of CDS+3’UTR RPKM values on ribo-seq quartile-1 genes (n = 3589) for Torin-1 treated and vehicle treated RPS2-STAMP 72-hour high (1μg/ml) doxycycline inductions.
Extended Data Fig. 3. Long-read STAMP reveals isoform specific binding profiles, related to Figure 3.
A) Heatmap of control- and RBFOX2-STAMP edit fractions calculated from the final exon of all detected primary and secondary alternative polyadenylation (APA) isoforms meeting coverage criteria (see materials and methods). B) IGV tracks showing RBFOX2-APOBEC1 eCLIP peaks, control- and RBFOX2-STAMP short-read edit clusters, compared to control- and RBFOX2-STAMP long-read (PB) alignments on long, middle and short APA isoforms of the target gene PIGN, with green colored C-to-U conversions on different isoforms.
Extended Data Fig. 4. Comparison of bulk STAMP to single-cell STAMP, related to Figure 4.
A) Overlap between single-cell and bulk RBFOX2-STAMP target genes containing edit-clusters. B) Fraction of RBFOX2-APOBEC1 eCLIP peaks overlapping low and high induction single-cell RBFOX2-STAMP edit-clusters at increasing expression (TPM) thresholds.
Extended Data Fig. 5. Single-cell RBP-RNA interaction detection by STAMP for multiple RBPs and in multiple cell types, related to Figure 5.
A) UMAP plot using ε score from RBFOX2-STAMP and TIA1-STAMP mixture with capture sequence RBFOX2-STAMP (blue, n = 844) and TIA1-STAMP cells (red, n = 527) highlighted. B) UMAP plot as in A color-coded by Louvain clustering into RBFOX2-cluster (blue), and TIA1-cluster (red), or background-cluster (gray) populations. C) UMAP plot of gene expression for ε score Louvain clusters defined in B. D) Motif enrichment using HOMER from ≥ 0.99 confidence edits from combined RBFOX2-cluster and control-STAMP cells. E) UMAP plot showing expression of neural precursor cell markers NES, PAX6, SOX2 and DCX. F) Motif enrichment using HOMER from ≥ 0.99 confidence edits from combined control- and RBFOX2-STAMP HEK293T and NPC cells.
Extended Data Fig. 6. Single-cell Ribo-STAMP detects ribosome occupancy from individual cells, related to Figure 6.
A) Genome-wide comparison of CDS+3’UTR EPKM values for bulk and single-cell EPKM-derived RPS2-population. B) Comparison of EPKM-derived RPS2-population CDS and CDS+3’UTR EPKM values. C) Comparison of EPKM-derived RPS2-population total mRNA RPKM values with total mRNA RPKM values from polysome-seq input. D) Comparison of EPKM-derived RPS2-population CDS+3’UTR EPKM values with total mRNA RPKM values from polysome-seq input. E) UMAP analysis of ε score from merged 72-hour high-induction RPS2-STAMP (green), control-STAMP (orange) and mixed-cell RBFOX2:TIA1-STAMP (purple) single-cell experiments. F) UMAP plot as in E with only capture sequence RBFOX2-STAMP (blue, n = 844) and TIA1-STAMP cells (red, n = 527) highlighted. G) Individual cell barcode overlap for EPKM-derived and ε score-derived RPS2-populations.
Figure 1. RBP-STAMP edits mark specific RBP binding sites.
A) Surveying Targets by APOBEC-Mediated Profiling (STAMP) strategy fuses rat APOBEC1 module to an RBP of interest to deposit edits at or near RBP binding sites. C-to-U mutations from either APOBEC1-only control (control-STAMP) or RBP fusion (RBP-STAMP) can be detected by standard RNA-sequencing and quantified using our SAILOR analysis pipeline. B) Integrative Genomics Viewer (IGV) browser tracks showing RBFOX2 and RBFOX2-APOBEC1 eCLIP peaks on the target gene APP, compared with control- and RBFOX2-STAMP signal and SAILOR quantified edit fraction for increasing induction levels of fusions (doxycycline: 0ng = none, 50ng = low, or 1μg/ml = high, 72 hours). C) IGV tracks showing 72-hour high-induction control- and RBFOX2-STAMP signal on the APP target gene at increasing confidence levels. D) RBFOX2-STAMP replicate correlations for the edited read counts per target normalized for length and coverage (EPKM). E) Quantification of expression from no dox (0ng/ml), low (50ng/ml) or high (1μg/ml) doxycycline induction of RBFOX2-APOBEC1 fusion compared to endogenous RBFOX2 expression. F) RBFOX2-STAMP and control-STAMP (background) edit frequency distribution within a 400 bp window flanking RBFOX2 eCLIP binding-site motifs, split into increasing levels of log2 fold enrichment of eCLIP peak read-density over size-matched input. G) Fraction of RBFOX2-APOBEC1 eCLIP peaks (log2fc>2 and -log10p>3 over size-matched input) with RBFOX2-STAMP edit-clusters, compared to size-matched shuffled regions, calculated at different edit site confidence levels before and after site filtering (see Materials and Methods for filtering procedure). Numbers atop bars are Z-scores computed comparing observed with the distribution from random shuffles. *** denotes statistical significance at p = 0, one-sided exact permutation test.
H) Pie chart showing the proportion of filtered RBFOX2-STAMP edit-clusters overlapping either 1) RBFOX2-APOBEC1 fusion high-confidence eCLIP peaks (log2fc>2 and -log10p>3) containing the conserved RBFOX2 binding motif, 2) equally stringent eCLIP peaks not containing the conserved motif, 3) the conserved motif falling outside of eCLIP peaks, or 4) neither eCLIP peaks nor conserved motifs. I) Motif enrichment using HOMER and shuffled background on RBFOX2-STAMP edit-clusters for increasing RBFOX2-STAMP induction levels. J) IGV tracks showing control- and SLBP-STAMP edit fractions at no- and high-induction (doxycycline: 0ng = none or 1μg/ml = high, 72 hours) on the target histone gene H2AC16 compared to SLBP-APOBEC1 eCLIP. K) IGV tracks showing control- and TIA1-STAMP edit fractions at no- and high-induction (doxycycline: 0ng = none or 1μg/ml = high, 72 hours) on the target gene NPM1 compared to TIA1-APOBEC1 eCLIP.
Figure 2. Ribo-STAMP edits mark highly translated coding sequences.
A) IGV browser tracks displaying coding sequence edit frequency from control, RPS2-STAMP, and RPS3-STAMP at no-induction or 72-hour high-induction on the ATP5BP gene locus. RPS3 eCLIP and input reads are shown for comparison. B) IGV browser tracks as in A on the noncoding RNA MALAT1, showing no enrichment for RPS3 eCLIP reads, RPS2- or RPS3-STAMP edits. C) Genome-wide scatterplot comparison of control- and RPS2-STAMP EPKM and ribo-seq ribosome protected fragment (RPF) RPKM for increasing levels of RPS2-STAMP. D) Comparison as in C with ribo-seq RPF RPKM and EPKM from RPS3-STAMP. E) Comparison as in C with polysome-seq RPKM and EPKM from RPS2-STAMP. F) Metagene plot showing edit (≥ 0.5 confidence score) distribution for high-induction RPS2-STAMP compared to control-STAMP and RBFOX2-STAMP across 5’UTR, CDS and 3’UTR gene regions for the top quartile (n=4,931) of ribosome occupied genes (ribo-seq). G) Metagene plot as in F showing edit (≥ 0.5 confidence level) distribution for vehicle-treated 72-hour high-induction RPS2-STAMP compared to replicate Torin-1 treated 72-hour high-induction RPS2-STAMP across 5’UTR, CDS and 3’UTR gene regions for the top quartile of ribosome occupied genes. H) Comparison of EPKM from combined replicates (n = 2) vehicle treated 72-hour high-induction RPS2-STAMP compared to Torin-1 treated 72-hour high-induction RPS2-STAMP showing significant signal reduction for top ribosome occupied quartile genes containing Torin-1 sensitive TOP genes as detected by ribo-seq (Q1 p = 1.9 e-147, n = 3589 genes, Wilcoxon rank-sum one-sided) and polysome profiling (Q1 p = 7.7 e-108, n = 3589 genes, Wilcoxon rank-sum one-sided).
Figure 3. Long-read STAMP reveals isoform specific binding profiles.
A) IGV tracks showing RBFOX2 eCLIP peak on the target gene APP, compared with 72-hour high-induction control- and RBFOX2-STAMP SAILOR quantified edit fractions for both long-read (Oxford Nanopore Technologies (ONT) or PacBio (PB)) direct cDNA, and short read (NGS) outputs. B) HOMER motif analysis of RBFOX2-STAMP long-reads (ONT and PB) for edits above 0.99 confidence. C) Heatmap of control- and RBFOX2-STAMP edit fractions on the 2 primary alternative polyadenylation (APA) isoforms for the top differentially edited RBFOX2-STAMP APA targets. D) IGV tracks showing RBFOX2-APOBEC1 eCLIP peaks, control- and RBFOX2-STAMP short-read edit frequencies, and control- and RBFOX2-STAMP long-read (PB) alignments on the 2 primary isoforms of the target gene FAR1, with red colored C-to-U conversions on different isoforms.
Figure 4. STAMP allows RBP binding site detection at single-cell resolution.
A) Edit fraction comparison of bulk 72-hour high-induction control- and RBFOX2-STAMP with single-cell control- and RBFOX2-STAMP across the top 200 genes ranked by transcripts per million (TPM) from bulk RBFOX2-STAMP RNA-seq. B) IGV tracks showing the RBFOX2 eCLIP peak on the target gene UQCRH, compared with RBFOX2-STAMP edit fractions for the top 10 control- and RBFOX2-STAMP cells ranked by summed ε scores. C) Evaluation of percentage overlap between bulk and single-cell edit-clusters showing that 60–75% of single-cell edit clusters overlap bulk edit clusters over increasing cluster-flanking regions. D) Overlap between RBFOX2-APOBEC1 eCLIP target transcripts (peaks log2fc>2 and -log10p>3 over input) and single-cell RBFOX2-STAMP edit-cluster containing target transcripts. E) Pie chart showing the proportion of single-cell RBFOX2-STAMP edit-clusters overlapping either 1) RBFOX2-APOBEC1 fusion high-confidence eCLIP peaks (log2fc>2 and -log10p>3 over input) containing the conserved RBFOX2 binding motif (GCAUG), 2) equally stringent eCLIP peaks not containing the conserved motif, 3) the conserved motif falling outside of eCLIP peaks, or 4) neither eCLIP peaks nor conserved motifs. F) Cumulative distance measurement from single-cell RBFOX2-STAMP distal edit-clusters to eCLIP peaks on target genes. G) -log10 of p-values (n = 10 trials) for motifs extracted by HOMER (v4.9.1) using RBFOX2-STAMP ≥ 0.99 confidence level edits from randomly sampled cells showing RBFOX2 motif detection to 1 cell resolution.
Figure 5. Deconvolution of multiple RBPs and cell-type specific targets.
A) Uniform Manifold Approximation and Projection (UMAP) analysis of gene expression from merged 72-hour high-induction control- and RBFOX2:TIA1-STAMP cells with capture sequence RBFOX2-STAMP (blue, n = 844) and TIA1-STAMP cells (red, n = 527) highlighted. B) UMAP analysis using ε score rather than gene expression after merging 72-hour high-induction control-STAMP cells (orange). C) UMAP plot as in B color-coded by ε score Louvain clustering into RBFOX2-population (blue), TIA1-population (red) and background-population (gray) with control-STAMP cells (orange) overlaid. D) Heatmap of normalized ε score signatures for RBFOX2- and TIA1-population cells compared to control-STAMP and background cells on the top 25 differentially edited gene targets. E) IGV browser tracks showing SAILOR quantified edit fractions for the top 5 control-, RBFOX2-, and TIA1-STAMP cells (ranked by summed ε scores) on the NPM1, BTF3 and CFL1 gene targets. F) UMAP analysis of merged 72-hour high-induction RBFOX2-STAMP mixed NPC and HEK293T cells clustered by expression. G) UMAP analysis as in F using ε score. H) ε score distribution summarized by violin plot for HEK293T and NPC defined cell populations for the top differentially edited genes. I) Violin plots as in H summarizing expression rather than ε score. J) IGV browser tracks showing edit fractions and read coverage for the top 5 control- and RBFOX2-STAMP cells (ranked by summed ε scores) on the RPL14 and RPL13A gene targets.
Figure 6. Ribo-STAMP reveals ribosome occupancy from individual cells.
A) UMAP analysis of EPKM for 72-hour high-induction RPS2-STAMP (green), control-STAMP (orange). B) UMAP analysis of cells shown in A with EPKM Louvain clustering into background-population and RPS2-population. C) Comparison of EPKM-derived RPS2-population CDS+3’UTR EPKM values with poly-ribosome-fraction-enriched polysome-seq RPKM values. D) UMAP plot color-coded by ε score Louvain clustering into background-cluster (orange), RBFOX2-cluster (blue), TIA1-cluster (red), and 677 RPS2-cluster (green) from merged 72-hour high-induction STAMP experiments. E) Comparison of ε score-derived RPS2-population CDS+3’UTR EPKM values with poly-ribosome-fraction-enriched polysome-seq RPKM values. F) Metagene plot showing distribution for aggregate cell edits (≥ 0.5 confidence level) from control-STAMP, RPS2-cluster, TIA1-cluster, and RBFOX2-cluster cells across 5’UTR, CDS and 3’UTR gene regions for the top quartile of ribosome occupied genes. G) Heatmap of normalized ε score signatures for RPS2-population, RBFOX2-population, and TIA1-population cells compared to background cells on the top 15 differentially edited gene targets. H) IGV browser tracks showing edit fractions for the top 10 control-, RPS2-, RBFOX2-, and TIA1-STAMP cells (ranked by summed ε scores) on the RPL12, RPL30 and RPL23A gene targets.
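Several captions above (Figure 1G, Extended Data Fig. 1J-K) report Z-scores and one-sided permutation p-values for the overlap between eCLIP peaks and STAMP edit-clusters relative to size-matched shuffled regions. The Python sketch below illustrates that general style of calculation on a single toy coordinate axis. It is not the authors' pipeline: the function names, the choice to shuffle the clusters rather than the peaks, the uniform placement of shuffled regions, and the number of shuffles are all assumptions made for illustration.

```python
import random
import statistics

def overlap_fraction(peaks, clusters):
    """Fraction of peaks overlapped by at least one cluster.

    Intervals are (start, end) tuples, half-open, on one coordinate axis.
    """
    hits = sum(
        any(ps < ce and cs < pe for cs, ce in clusters)
        for ps, pe in peaks
    )
    return hits / len(peaks)

def shuffle_clusters(clusters, region_length, rng):
    """Place each cluster at a uniform random position, keeping its size.

    Assumption: a real analysis would likely restrict shuffles to
    expressed transcripts; a single flat region is used here.
    """
    shuffled = []
    for cs, ce in clusters:
        size = ce - cs
        start = rng.randrange(0, region_length - size + 1)
        shuffled.append((start, start + size))
    return shuffled

def permutation_summary(peaks, clusters, region_length, n_shuffles=1000, seed=0):
    """Return (observed fraction, Z-score, one-sided empirical p-value)."""
    rng = random.Random(seed)
    observed = overlap_fraction(peaks, clusters)
    null = [
        overlap_fraction(peaks, shuffle_clusters(clusters, region_length, rng))
        for _ in range(n_shuffles)
    ]
    mean = statistics.fmean(null)
    sd = statistics.pstdev(null)
    # Guard against a degenerate null in which every shuffle gives the same value.
    z = (observed - mean) / sd if sd > 0 else float("inf")
    # One-sided: share of shuffles with overlap at least as large as observed.
    p = sum(x >= observed for x in null) / n_shuffles
    return observed, z, p

# Toy, made-up coordinates (not data from the paper).
peaks = [(100, 150), (500, 560), (900, 950)]
clusters = [(110, 120), (510, 530), (905, 915)]
observed, z, p = permutation_summary(peaks, clusters, region_length=100_000)
print(observed, z, p)
```

Under this scheme, an empirical p-value of 0 means only that none of the shuffles reached the observed overlap, so the smallest resolvable non-zero p-value is set by the number of shuffles.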
