Genome Res. 2012 Jul;22(7):1231-42.
doi: 10.1101/gr.130062.111. Epub 2012 May 15.

Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts

Milana Frenkel-Morgenstern et al. Genome Res. 2012 Jul.

Abstract

Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans.
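The RPKM unit used above normalizes a read count by transcript length (in kilobases) and by sequencing depth (in millions of mapped reads). The following is a minimal sketch of that arithmetic only; the function name and the example numbers are hypothetical, and the abstract does not state which reads were counted for a chimera or which length was used in the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads.

    RPKM = reads / (length in kb * mapped reads in millions)
         = reads * 1e9 / (length in bp * total mapped reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical numbers, not taken from the paper:
# 34 reads on a 2,000 bp transcript in a library of 20 million mapped reads.
example_value = rpkm(34, 2000, 20_000_000)
```

With these made-up inputs the transcript is 2 kb long and the library holds 20 million mapped reads, so the value is 34 / (2 × 20) = 0.85 RPKM.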

Figures

Figure 1.
Expression levels of genes and chimeric transcripts in humans. The expression of genes in human tissues ranges from 0.001 to 15,700 RPKM, with a median of 0.588, whereas the expression of chimeras ranges from 0.006 to 17.8 RPKM with a median of 0.02. Most of the genes involved in the formation of chimeras are moderately to highly expressed, as their expression ranges from 0 to 2495 RPKM, with a median of 12.6. The trend is also observed for the translated chimeras and their parental genes (Wilcoxon test, P-value < 5 × 10⁻⁶). The whiskers of the boxplot extend to the data extremes (see also Supplemental Fig. S4).
Figure 2.
A density plot of RPKM expression levels for all genes versus chimeras. The total number of chimeras is much lower than the total number of genes. Hence, the densities of the distributions are plotted and not the counts. The height of the bars does not correspond to number of transcripts, but to the proportion of transcripts in a given expression category. The distribution for all genes is bimodal, with chimeras falling in the low expressed genes distribution.
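The reason for plotting proportions rather than counts can be illustrated with a small sketch: dividing each bin count by the size of its own group puts a small set (chimeras) and a large set (all genes) on the same vertical scale. The function name, bin edges, and toy values below are hypothetical and are not the binning used for the figure.

```python
def proportion_histogram(values, edges):
    """Return the fraction of values falling in each bin.

    Bins are half-open, [edges[i], edges[i + 1]), except the last bin,
    which also includes its right edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = edges[i] <= v < edges[i + 1]
            on_right_edge = i == last and v == edges[-1]
            if in_bin or on_right_edge:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Toy RPKM values (made up) on logarithmically spaced bin edges.
edges = [0.001, 0.1, 1, 10]
few_chimeras = [0.01, 0.02, 0.5, 5.0]
many_genes = [0.01] * 300 + [0.5] * 200 + [5.0] * 500

chimera_proportions = proportion_histogram(few_chimeras, edges)
gene_proportions = proportion_histogram(many_genes, edges)
```

Both results sum to 1, so the two bar heights are directly comparable even though one group has 4 members and the other 1,000.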
Figure 3.
Tissue specificity of all genes versus chimeras. All genes are presented in red and chimeras in blue. The expression of chimeras is more tissue specific across the different expression levels (ANCOVA, P-value < 7.7 × 10⁻¹³). The bins are chosen so as to cover the expression range of all chimeras and have an equal number of chimeras per bin.
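Bins that span the full chimera expression range and hold an equal number of chimeras can be obtained by sorting the chimera expression values and cutting the sorted list into equal-sized runs. The sketch below shows one way to do this; the function name and the toy values are hypothetical, and the caption does not say how many bins were used or how ties and remainders were handled.

```python
def equal_count_bins(values, n_bins):
    """Split values into n_bins groups of (nearly) equal size by rank.

    Values are sorted first, so the first bin starts at the minimum and the
    last bin ends at the maximum. When the count is not divisible by n_bins,
    the earliest bins each receive one extra value.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n_bins <= 0 or n_bins > n:
        raise ValueError("n_bins must be between 1 and the number of values")
    base, extra = divmod(n, n_bins)
    bins = []
    start = 0
    for i in range(n_bins):
        end = start + base + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


# Toy chimera expression values in RPKM (made up for illustration).
chimera_rpkm = [0.5, 0.006, 17.8, 0.02, 0.1, 2.0]
bins = equal_count_bins(chimera_rpkm, 3)

# The (min, max) of each bin gives the expression range in which genes
# would then be compared against chimeras.
bin_ranges = [(b[0], b[-1]) for b in bins]
```

Each bin's range can then be applied to the full gene set, so genes and chimeras are compared at matched expression levels.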
Figure 4.
A chimera with confirmed RNA and protein expression. We detected two overlapping unique peptides that matched the junction site in 18 mass spectrometry experiments and by the targeted mass spectrometry (SRM) analysis, confirming that this transcript (ESTid = “BM838228.1”) from ChimerDB (Kim et al. 2010) is expressed at the protein level. (A) The 3D structure of the chimeric protein is modeled by Phyre2 (Kelley and Sternberg 2009). (Green) The chimeric protein part derived from actin, ACTG1, predicted using homology modeling; (red) the part of the ribosomal protein, RPL13A, predicted using ab initio methods. The structure is modeled using the Ribonuclease H-like motif fold (actin-like ATPase domain) with 100% confidence and 85% identity. (B) The secondary structure modeling by Phyre2 (Kelley and Sternberg 2009) predicts that a highly preserved beta strand appearing in the wild-type actin protein should also feature in the chimera (blue rectangle). The motif “GDGV” (red rectangle) is the ATP-binding site, which is missing in the chimera sequence.
Figure 5.
Selective reaction monitoring (SRM) mass spectrometry analysis. The peptide VISSIEQKTMAAPSVK at the junction site of the BF969911.1 chimera was confirmed by SRM analysis using a stable isotope labeled standard. Briefly, a peptide of the same amino acid sequence was synthesized with a heavy lysine residue, which was then spiked into the digested human prostate cancer lysate. The mixture was fractionated by high pH reversed phase liquid chromatography and the fractions analyzed by SRM mass spectrometry. On the basis of the concentration of the labeled standard, the chimera was estimated to be present at a concentration of ∼30 fmol/mL. A signal-to-noise ratio was calculated as root-mean-square (RMS).
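Quantification against a stable isotope labeled standard is commonly done by scaling the known concentration of the spiked heavy peptide by the ratio of the endogenous (light) signal to the heavy signal. The sketch below assumes this simple single-point ratio and a signal-to-noise ratio defined as peak height over the RMS of baseline noise; the caption does not give the exact procedure, and all function names and numbers here are hypothetical.

```python
import math


def light_concentration(light_peak_area, heavy_peak_area, heavy_concentration):
    """Estimate the endogenous (light) peptide concentration.

    Assumes the light and heavy peptides respond identically in the
    instrument, so concentration scales with the peak-area ratio.
    """
    if heavy_peak_area <= 0:
        raise ValueError("heavy peak area must be positive")
    return (light_peak_area / heavy_peak_area) * heavy_concentration


def rms(samples):
    """Root-mean-square of a sequence of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_signal_to_noise(peak_height, baseline_samples):
    """Peak height divided by the RMS of baseline (noise-only) samples."""
    noise = rms(baseline_samples)
    if noise == 0:
        raise ValueError("baseline noise is zero")
    return peak_height / noise


# Hypothetical values: a light/heavy area ratio of 0.5 against a 60 fmol/mL
# standard, and a peak of height 25 over a baseline of unit RMS.
estimated = light_concentration(1500.0, 3000.0, 60.0)
snr = rms_signal_to_noise(25.0, [1.0, -1.0, 1.0, -1.0])
```

The estimate inherits its units from the standard, so a standard expressed in fmol/mL yields an estimate in fmol/mL.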
Figure 6.
Putative chimeric proteins often contain the signal peptides or TM domains of the parental proteins. (A) Schematic view of the two proteins participating in the human chimera: thioredoxin domain containing protein 5 (TXNDC5) and lysophosphatidylcholine acyltransferase 2 (LPCAT2). (B) Schematic view of the hypothetical chimera comprising the signal peptide of TXNDC5 and two TM domains of LPCAT2. We predict that this chimera is localized in the ER lumen.
