Nat Commun. 2022 Jun 29;13(1):3751.
doi: 10.1038/s41467-022-31327-y.

Pervasive translation of circular RNAs driven by short IRES-like elements

Xiaojuan Fan et al. Nat Commun. 2022.

Abstract

Some circular RNAs (circRNAs) have been found to be translated through an IRES-driven mechanism; however, the scope and functions of circRNA translation are unclear because endogenous IRESs are rare. To determine the prevalence and mechanism of circRNA translation, we develop a cell-based system to screen random sequences and identify 97 overrepresented hexamers that drive cap-independent circRNA translation. These short IRES-like elements are significantly enriched in endogenous circRNAs and are sufficient to drive circRNA translation. We further identify multiple trans-acting factors that bind these IRES-like elements to initiate translation. Using mass-spectrometry data, we identify hundreds of circRNA-coded peptides, most of which have low abundance due to rapid degradation. As judged by mass spectrometry, 50% of translatable endogenous circRNAs undergo rolling circle translation, several of which are experimentally validated. Consistently, mutations of the IRES-like element in one circRNA reduce its translation. Collectively, our findings suggest pervasive translation of circRNAs, with profound implications for translation control.

Conflict of interest statement

Z.W. and Y.Y. have co-founded a company, CirCode Biotech, to commercialize the application of circular RNA as a template for protein production/expression, and have applied for a patent on the use of circRNA as a gene expression platform. The other authors declare no competing interests.

Figures

Fig. 1. Extensive IRES-like elements can drive circRNA translation.
Source data are provided in a Source Data file for panels (e–h). a Random decamers were inserted into pcircGFP-BsmBI, the resulting library was transfected into 293T cells, and the green cells were sorted by FACS. The inserted sequences were recovered by RT-PCR and sequencing. The primers for RNA-seq library production are indicated by blue arrows. The enriched hexamers were identified computationally. b Flow cytometry of cells transfected with the circRNA reporter containing the 10-mer library. The cells were classified into four groups based on their GFP fluorescence (GFP negative, low, medium, or high). The cells with medium and high fluorescence were sorted as "green cells". c Single-nucleotide frequency in the starting library (top) and in the sequences enriched in green cells (bottom). d Frequencies and odds ratios of the dinucleotides in the sequences enriched in green cells. The odds ratio is defined as the probability of a dinucleotide divided by the product of the probabilities of its two bases. e The 97 enriched hexamers (i.e., IRES-like elements, z score > 7) were clustered into 11 groups, with the consensus motifs shown as pictograms (top). Representative hexamers in each cluster were inserted back into the circRNA reporter, which was transiently transfected into 293T cells. GFP was assayed by western blot at 48 h after transfection (bottom). f The 122 depleted hexamers (i.e., negative controls, z score < -7) were clustered into 11 groups, with the consensus motifs shown as pictograms (top). Representative hexamers in all clusters were tested using the same conditions as panel (e), with AAAAAA as the positive control. g Comparison of newly identified IRES-like elements with m6A sites (RSV and m6A) for their activity in driving circRNA translation, using the same conditions as panel (e). h Effects of neighboring sequences on circRNA translation. The enriched and depleted hexamers were inserted into circRNA reporters with or without an ATC trimer (which partially resembles the Kozak sequence). The samples were analyzed using the same conditions as panel (e). The circRNA reporters containing a poly-A sequence were loaded twice as a control.
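The dinucleotide ratio defined for panel (d) of Fig. 1 (the probability of a dinucleotide divided by the product of the probabilities of its two bases) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `dinucleotide_ratios` and the choice to pool counts across all input sequences are assumptions made for illustration only.

```python
from collections import Counter


def dinucleotide_ratios(sequences):
    """Return {dinucleotide: P(xy) / (P(x) * P(y))} pooled over all sequences.

    Assumption (not from the source): probabilities are estimated as simple
    pooled frequencies of bases and of overlapping dinucleotides.
    """
    base_counts = Counter()
    pair_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        base_counts.update(seq)
        # Count every overlapping dinucleotide in the sequence.
        pair_counts.update(seq[i:i + 2] for i in range(len(seq) - 1))

    total_bases = sum(base_counts.values())
    total_pairs = sum(pair_counts.values())

    ratios = {}
    for pair, count in pair_counts.items():
        p_pair = count / total_pairs
        p_first = base_counts[pair[0]] / total_bases
        p_second = base_counts[pair[1]] / total_bases
        ratios[pair] = p_pair / (p_first * p_second)
    return ratios
```

A ratio above 1 means the dinucleotide occurs more often than expected from the single-base frequencies alone; a ratio below 1 means it occurs less often.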
Fig. 2. CircRNAs contain many IRES-like elements that initiate translation.
Source data are provided as a Source Data file for panels (b) and (c). a IRES-like elements are significantly enriched in circRNAs. Average frequencies of different types of hexamers (all hexamers, IRES-like hexamers, and depleted hexamers) in the internal exons of linear mRNAs vs. circRNAs were plotted. N: the number of RNA sequences in each dataset. The p-values were calculated with a two-sided Kolmogorov–Smirnov test. b Translation of circRNAs can be initiated by internal coding sequences. Left panel, schematic diagrams of expression reporters for a linear mRNA and four circRNAs that code for GFP. From top to bottom: linear GFP mRNA; circRNA with an m6A site upstream of the start codon; circRNA with an upstream m6A site and no stop codon; circRNA with the start codon immediately following the stop codon; circRNA containing only the coding sequence (no stop codon or UTR). Red line, stop codon. The circRNA plasmids were transfected into 293T cells, and samples were analyzed by western blot at 2 days after transfection (right panel). The level of GAPDH was measured as a loading control. The green arrow indicates the full-length GFP, and the asterisks indicate the truncated GFP proteins translated from internal start codons. The bar graph represents the quantification of GFP protein levels relative to GAPDH. The protein levels were also normalized to the RNA. c Translation of the circRNA-coded Renilla luciferase (Rluc) using internal IRES-like elements. Top: schematic diagram of two Rluc circRNAs, in which the Rluc ORF was split into two parts so that the full-length Rluc ORF can only be generated through back-splicing. The cRluc contains a sequence coding for a T2A peptide, by which the translation product can be cleaved into full-length Rluc protein. The rcRluc does not contain a stop codon. Bottom: dual luciferase assay of cRluc and rcRluc. Control and circular Rluc plasmids were co-transfected with an Fluc (firefly luciferase) reference reporter into 293T cells. The cells were lysed at 48 h after transfection for luminescence measurement using a luminescence reader, and the relative luminescence signals were plotted.
Fig. 3. Systematic identification of trans-factors that recognize IRES-like elements.
Source data are provided as a Source Data file for panels (d) and (e). a Schematic diagram of RNA affinity purification. Biotin-labeled RNAs containing consensus motifs of IRES-like elements were incubated with HeLa cell lysate, and RNA-protein complexes were purified with streptavidin beads. The proteins were then identified by mass spectrometry. b Identification of trans-factors bound by each RNA probe. The probes representing five consensus motifs of IRES-like elements (red) and a control probe (blue) were used (full sequences in Supplementary Data 2). The total eluted proteins were separated by SDS-PAGE, and each band was cut and analyzed by mass spectrometry. The top three identified proteins in each band are labeled. c Protein–protein interaction network of identified trans-factors. Top proteins bound by all RNA probes (i.e., IRES-like elements) were analyzed by STRING and clustered into two main groups with the MCODE tool. d Validating the activity of trans-factors. The circRNA reporter containing two depleted hexamers in tandem was co-transfected into 293T cells with different Puf-fusion proteins that specifically recognize an 8-nt target in the inserted sequences. The resulting cells were collected at 2 days after transfection and analyzed by western blot and RT-PCR. Different pairs of Puf proteins and 8-nt targets were used as specificity controls. Puf-N2 specifically binds AGUGUCAG, whereas Puf-N8 specifically binds GCGUCUGC. The bar graph represents the quantification of GFP levels relative to GAPDH. The protein levels were also normalized to the RNA (N.S. not significant). e Validation of PABPC1 activity. The expression vectors of PABPC1 and various control RBPs were co-transfected with the circRNA translation reporter containing (A)10 or (G)10 sequences before the start codon, and the protein products were assayed at 48 h after transfection. The bar graph represents the quantification of GFP levels relative to GAPDH from the western blot. The protein levels were also normalized to the RNA, and the relative amount of GFP translation was calculated by dividing by the value from the mock transfection.
Fig. 4. Identification of circRNA-coded proteins.
a Position distribution of the first circRNA exon in the host genes. Full-length circRNA sequences were analyzed based on previous datasets, and the histogram was plotted according to the exon number of the first circRNA exon in the host genes. The inset pie chart presents the percentage of circRNAs overlapping with different mRNA regions. b Survey of potential circRNA-coded products. Left, percentage of endogenous circRNAs with an ORF > 20 aa. Purple slices: circRNAs translated regularly (cORF, dark purple) or in a rolling circle fashion (rcORF, light purple) into proteins that partially overlap with their host gene products for at least 7 aa; brown slices: circRNAs translated in a regular fashion (cORF, dark brown) or a rolling circle fashion (rcORF, light brown) into proteins that do not overlap with their host gene products but are homologous to other known proteins; blue slices: circRNAs translated into proteins that are not homologous to any known protein (cORF and rcORF combined); gray slices: circRNAs that do not contain any potential ORF longer than 20 aa. Right panels: the same analysis of putative coding products from two control circRNA sets: reversed sequences or randomly shuffled sequences of the endogenous circRNAs. c Schematic diagram of the identification of circRNA-coded proteins using proteomic datasets. d Left, computational filters sequentially applied to identify translatable circRNAs and the numbers of circRNAs passing each filter. Right, the percentage of different types of circRNA-coded ORFs in the circRNAs passing each filter. The definitions of the different circRNA-coded ORFs are the same as in panel (b). e Distribution of the supporting spectra for each translatable circRNA. f Distribution of the number of cell lines and tissues for each translatable circRNA in two proteomic datasets. g Comparison of the numbers of spectra from linear mRNAs vs. circRNAs. Abbreviations of the different tissues and cell lines are listed in Supplementary Data 4. The proteomic data from Bekker-Jensen DB et al. (blue) and from Kim MS et al. and Pinto SM et al. (red) were used. Green text indicates the 39 fractions, 46 fractions, and 70 fractions from HeLa cells obtained using high-capacity offline HpH reversed-phase LC.
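The distinction in Fig. 4b between a regular circRNA ORF (cORF) and a rolling circle ORF (rcORF, with no in-frame stop codon) can be sketched in Python as follows. This is an illustrative simplification, not the authors' filtering pipeline: the function name `scan_circular_orfs`, the restriction to AUG start codons, and the use of the three standard stop codons are all assumptions, and the sketch does not check whether an ORF crosses the back-splice junction.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed)


def scan_circular_orfs(seq, min_aa=20):
    """Scan a circular sequence for ORFs starting at ATG.

    Returns a list of (start_index, kind, length_in_codons) tuples, where
    kind is "rcORF" (no in-frame stop; length is None) or "cORF"
    (a stop is reached and the ORF is longer than min_aa codons).
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)

    def codon_at(pos):
        # Read three bases with wrap-around, as on a circular molecule.
        return seq[pos % n] + seq[(pos + 1) % n] + seq[(pos + 2) % n]

    results = []
    for start in range(n):
        if codon_at(start) != "ATG":
            continue
        length = None
        # After n codons (three laps) the codon positions repeat exactly,
        # so a stop not seen by then will never be encountered.
        for k in range(n):
            if codon_at(start + 3 * k) in STOP_CODONS:
                length = k
                break
        if length is None:
            results.append((start, "rcORF", None))
        elif length > min_aa:
            results.append((start, "cORF", length))
    return results
```

For example, the toy circle `ATGAAA` has a start codon and no in-frame stop in any lap, so the scan reports it as an rcORF at position 0.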
Fig. 5. Rolling circle translation of endogenous circRNAs.
Source data are provided as a Source Data file for panels (c–g). a The higher-energy collisional dissociation MS/MS spectra of the peptides across the back-splice junctions of human circPSAP (MMMHMEEILVYLEK), circPFAS (LLEVGPRNL), and circABHD12 (LPRILSVK). The annotated b- and y-ions are marked in red and green, respectively. b The rcORF translation reporters. The coding region of the endogenous rcORF was inserted into a back-splicing reporter, with an in-frame V5 epitope for detection. c Three rcORF reporters were transfected into 293 cells; the cells were collected at 48 h after transfection and analyzed by western blotting and RT-PCR. The blue arrows represent the predicted MW of the single-cycle translation product from cPSAP (22.4 kD), cPFAS (15.6 kD), or cABHD12 (10.5 kD). d The rcORF translation reporters were transfected into 293T cells, which were then treated with 10 µM MG132 for 2 h or 10 µM chloroquine for 4 h before cell collection. The bar graph represents the quantification of protein levels relative to GAPDH, which were also normalized to the RNA (N.S. not significant). e The circPFAS contains two IRES-like hexamers (AATTCA and AAGAAG), which were mutated into neutral sequences (mut1 and mut2). The downstream AUG codons were mutated into CUC (Start Mut #1 and #2), and the non-canonical CUG start codons downstream of the AAGAAG hexamer were further mutated into CUC (Start Mut #3). The effects of these mutations on protein production were determined by western blotting using a procedure similar to that in panel (c), with relative protein changes represented by bar graphs (N.S. not significant). f The back-splicing reporter of rcPFAS was co-transfected into 293T cells with the expression vectors of various trans-acting factors that bind to the newly identified IRES-like elements. The cells were collected and analyzed using the same procedures as in panel (c). g The circRNA with the mutated IRES-like hexamer AAGAAG (Mut2) was co-expressed with the same set of trans-acting factors, and the rolling circle translation products were measured using the same experimental conditions as described in panel (f).
