[Preprint]. 2024 Jan 30:2024.01.29.577729.
doi: 10.1101/2024.01.29.577729.

Understanding species-specific and conserved RNA-protein interactions in vivo and in vitro

Sarah E Harris et al. bioRxiv.

Abstract

While evolution is often considered from a DNA- and protein-centric view, RNA-based regulation can also impact gene expression and protein sequences. Here we examined interspecies differences in RNA-protein interactions using the conserved neuronal RNA binding protein Unkempt (UNK) as a model. We find that roughly half of mRNAs bound in human are also bound in mouse. Unexpectedly, even when transcript-level binding was conserved across species, differential motif usage was prevalent. To understand the biochemical basis of UNK-RNA interactions, we reconstituted the human and mouse UNK-RNA interactomes using a high-throughput biochemical assay. We uncover detailed features driving binding, show that in vivo patterns are captured in vitro, find that highly conserved sites are the strongest bound, and associate binding strength with downstream regulation. Furthermore, subtle sequence differences surrounding motifs are key determinants of species-specific binding. We highlight the complex features driving protein-RNA interactions and how these evolve to confer species-specific regulation.

Conflict of interest statement

The authors declare no conflicts.

Figures

Figure 1. Design and validation of natural sequence RNA bind-n-seq (nsRBNS).
A) (Venn diagram) Transcript level conservation of iCLIP UNK hits between human neuronal cells (SH-SY5Y) and mouse brain tissue. Significance determined via hypergeometric test. (Pie charts) Motif level conservation of iCLIP UNK hits between human neuronal cells (SH-SY5Y) and mouse brain tissue. B) Design of natural sequence oligo pool and layout of nsRBNS. C) Correlation plot of two experimental UNK nsRBNS replicates. Pearson’s correlation coefficient included. D) Cumulative distribution function of log2 enrichment of all oligos separated by UAG motif content. E) Scatter plot of log2 enrichment of wildtype (Y-axis) versus motif mutant (X-axis) oligos. Log2 change in enrichment (wt-mut) was calculated for each sequence pair: > 0.5 defined as bound better in wt (blue), < −0.5 defined as bound better in mut (red), 0 ± 0.5 defined as similar binding (grey).
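The three-way classification described for panel E can be sketched in a few lines. This is only an illustration of the stated thresholds, not the authors' analysis code; the function name `classify_pair` and its arguments are hypothetical, and the inputs are assumed to be log2 enrichment values that have already been computed for a wildtype oligo and its motif mutant.

```python
def classify_pair(wt_log2, mut_log2, threshold=0.5):
    """Classify a wildtype/motif-mutant oligo pair by its change in log2 enrichment.

    Hypothetical helper: applies the cutoffs quoted in the Figure 1E caption
    (> 0.5, < -0.5, otherwise similar) to delta = wt - mut.
    """
    delta = wt_log2 - mut_log2
    if delta > threshold:
        return "bound better in wt"   # blue in the caption
    if delta < -threshold:
        return "bound better in mut"  # red in the caption
    return "similar binding"          # grey in the caption (0 +/- 0.5)


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    for wt, mut in [(2.0, 1.0), (0.0, 1.0), (1.0, 0.75)]:
        print(wt, mut, classify_pair(wt, mut))
```

Pairs whose difference is exactly 0.5 or −0.5 fall into the "similar binding" class here, which is one reading of the "0 ± 0.5" wording in the caption.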
Figure 2. Analysis of species-specific binding patterns.
A) Schematic of “control,” “orthologous,” and “bound” oligo classes used for species-specific transcript-level binding analysis. B-C) Cumulative distribution function of log2 enrichment of all iCLIP hits: control (light grey; dotted), orthologous (dark grey), and bound (teal) of B) CDS and C) UTR oligos. Significance of bound vs. orthologous was determined via KS test. Insets show boxplot of in vitro binding patterns for “bound,” “motif mutant,” and “orthologous” oligos. Significance was determined via two-sided Wilcoxon test. D) Cumulative distribution function of log2 fold enrichment change of in vivo bound over in vivo not bound oligos separated by ΔUAG content. E) Cumulative distribution function of RiboSeq log2 fold change separated via iCLIP detection. nsRBNS enrichment cutoffs defined as “less enrichment” <1 and “better enrichment” >1.
Figure 3. Analysis of species-specific syntenic motif level binding patterns.
A) Definition of “conserved,” “bound elsewhere,” and “not bound” oligo classes used for species-specific transcript regional binding analysis. B) Conservation and binding of GGPS1 orthologous pairs. (left) Log2 enrichment values from nsRBNS for human bound (purple triangle), mouse not bound (light green circle), mouse bound (green open triangle), and human not bound (light purple open circle). (right) Alignment of human bound (purple triangle) to mouse not bound (light green circle) and mouse bound (green open triangle) to human not bound (purple open circle). Note: full oligos were used for alignment but only central region is shown. C-D) Cumulative distribution function of log2 enrichment of control (light grey; dotted), not bound (teal), bound elsewhere (purple), conserved (blue), and perfectly conserved (orange) C) all CDS and D) motif conserved CDS oligos. Insets show significance values for all comparisons via KS test and corrected for multiple comparisons via the BH procedure. Red denotes significant (p<=0.05). Values are as follows: a (ns), c (p<=0.05), d (p<=0.01), e (p<=0.001), f (p<=0.0001). E) Cumulative distribution function of RiboSeq log2 fold change separated via iCLIP detection and sequence conservation. F) Log2 fold change of mean base pair probability of the central region of “perfectly conserved,” “conserved,” “bound elsewhere,” and “not bound,” oligos normalized to UAG-containing CDS controls (see Methods). Error bars show standard error of the mean.
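For readers unfamiliar with the BH (Benjamini-Hochberg) procedure named in the Figure 3 insets, a minimal sketch of the standard adjustment follows. It is a generic textbook implementation, not the authors' code; the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Generic sketch: each p-value is scaled by m / rank (rank taken in
    ascending order), then monotonicity is enforced from the largest
    p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.3f}, adjusted p = {q:.3f}, significant at 0.05: {q <= 0.05}")
```

An adjusted value at or below 0.05 corresponds to the "significant" threshold quoted in the caption.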
Figure 4. Analysis of regional impacts on binding.
A) Design of single and double chimera oligos. B) Design and box and whisker plot of normalized log2 enrichment (chimera/wt) for “UAG Change” single chimeras. Significance was determined via paired, one-sided Wilcoxon test and corrected for multiple comparisons via the BH procedure. Chimerization at positions 58–67 was found to be significant (p<=0.001). C) Design and box and whisker plot of normalized log2 enrichment (chimera/wt) for “Context Change” single chimeras. Significance was determined via paired, one-sample Wilcoxon test and corrected for multiple comparisons via the BH procedure. Following multiple comparison correction, no positions were determined to be significant. D) Heat map of median normalized log2 enrichment (chimera/wt) for “UAG Change” single and double chimeras. E) Fraction of “UAG Change” chimeras with enhanced binding after single (red) or double (grey, striped) chimerization. F) Heat map of median normalized log2 enrichment (chimera/wt) for “Context Change” single and double chimeras. G) Fraction of “Context Change” chimeras with enhanced binding after single (red) or double (grey, striped) chimerization. H) Log2 enrichment values from nsRBNS for human, mouse, single chimera, and double chimera GTPB4 at 5, 50, and 500 nM UNK. I) Fluorescence polarization binding curves for human GTPB4, mouse gtpb4, and chi58–67 gtpb4 RNA oligos incubated with UNK.
Figure 5. Evolutionary Conservation of Binding.
A) Simplified view of the 100-vertebrate natural sequence alignment. B) Delta log2 enrichment, percent RNA sequence identity, percent UNK similarity (full length-grey and RBDs-green), and evolutionary distance in millions of years against 100 vertebrates for the aligned sequences from the top human bound oligos. Error bars show standard error of the mean (SEM). C) Mean percent RNA sequence identity (Y-axis) versus mean delta log2 enrichment (X-axis) for each aligned oligo. Pearson’s correlation coefficient included. D) Evolutionary distance in millions of years (Y-axis) versus mean delta log2 enrichment (X-axis) for each aligned oligo. Pearson’s correlation coefficient included. E) (left) Multiple sequence alignment for ATP1B1 for Homo sapiens, Mus musculus, Sus scrofa, Vicugna pacos, Tetraodon nigroviridis, and Danio rerio with normalized enrichment by species. (right) Percent RNA sequence identity (Y-axis) versus normalized delta log2 enrichment (X-axis). Pearson’s correlation coefficient included. F) Scatter plot of log2 normalized UNK binding enrichment by evolutionary distance. X-axis plotted on log10 scale. Error bars show SEM. Data were separated by regulation as determined via RiboSeq where blue reflects UNK repression of translation [higher than average log2 fold change (>−0.9)] and red reflects lack of UNK repression [less than average log2 fold change (<−0.9)]. Significance was determined via KS test.
Figure 6. Models of RNA Binding.
A) Simple binding model: considers only primary motifs. B) Moderate binding model: considers primary and secondary motifs. C) Complex binding model: considers primary and secondary motifs as well as RNA secondary structure.
