PLoS Genet. 2006 Apr;2(4):e23.
doi: 10.1371/journal.pgen.0020023. Epub 2006 Apr 28.

Pseudo-messenger RNA: phantoms of the transcriptome

Martin C Frith et al. PLoS Genet. 2006 Apr.

Abstract

The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.
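One kind of disruption named in the abstract, a premature in-frame stop codon, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `premature_stops` and `classify` are hypothetical, the input is assumed to be a DNA coding sequence already in frame from its start codon, and frameshifts (which require alignment against a reference protein) and sequencing-error checks are not handled.

```python
# Illustrative sketch only; names and categories are assumptions,
# not taken from the paper's computational pipeline.
STOP_CODONS = {"TAA": "ochre", "TAG": "amber", "TGA": "opal"}


def premature_stops(cds):
    """Return (codon_index, stop_name) for every in-frame stop codon
    that occurs before the final codon of the sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [
        (i, STOP_CODONS[codon])
        for i, codon in enumerate(codons[:-1])  # last codon is the expected stop
        if codon in STOP_CODONS
    ]


def classify(cds):
    """Label a sequence by the kinds of premature stop it contains."""
    kinds = {name for _, name in premature_stops(cds)}
    if not kinds:
        return "intact"
    if kinds == {"opal"}:
        # Only TGA disruptions: the pattern the abstract links to
        # possible selenoproteins.
        return "opal-only"
    return "disrupted"


if __name__ == "__main__":
    for seq in ("ATGAAATAA", "ATGTGAAAATAA", "ATGTAGTGATAA"):
        print(seq, classify(seq))
```

The "opal-only" label corresponds to transcripts whose only reading-frame disruptions are TGA codons; whether any such transcript actually encodes a selenoprotein cannot be decided from the sequence scan alone.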

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Number of ψmRNAs as a Function of Alignment E-Value Cutoff
Figure 2. Truncated ORFs of ψmRNAs and Potential for NMD
We chose either the ORF with the maximal number of codons (A) or the ORF whose in-frame overlap is earliest (B).
Figure 3. An Excess Number of ψmRNAs Have TGA Stop Codons as the Only Reading Frame Disruption
Figure 4. Number of Out-of-Frame Codons in Transcripts with Compensatory Frameshifts
Figure 5. Ratios of Synonymous and Nonsynonymous Substitution in Expressed Pseudogenes
(A and B) Distribution of dn/ds (A) and ds (B) for murine expressed pseudogenes. (C and D) Distribution of dn/ds (C) and ds (D) for non-redundant pairs of paralogous mouse protein-coding genes (youngest duplicate chosen for each gene). The dn/ds distribution of murine expressed pseudogenes suggests the existence of two subpopulations: one that has experienced protein-coding constraint and one that has not.
Figure 6. Similarity between Mouse Expressed Pseudogenes and Human Genomic Sequences
BLAST E-values are shown for murine expressed pseudogenes (line) and paired Swiss-Prot sequences (circles) versus human genome contigs. The total number of reported alignments is similar for both datasets; however, Swiss-Prot entries have consistently twice as many alignments with low E-values (0 < E < 10^-10), and with E = 0, than do expressed pseudogenes. Alignments with E = 0 (132 and 63, respectively) are not visualised as points on the plot, but their respective quantities can be deduced from the shifts of both curves at the root.
Figure 7. ψmRNA Examples
(A) Ubiquitin-conjugating enzyme E2D pseudogene and ψmRNAs on mouse Chromosome 11. Protein homologies (blue track at top) of human UBE2D and mouse Ube2 paralogues to mouse genomic sequence indicate an unprocessed pseudogene (two frameshifts, see alignment panels, blue gene object). Mouse FANTOM mRNAs (brown tracks, with accession numbers) match genomic sequence without disruption (see alignment panels). No current mouse cDNAs or ESTs support the precise exon structure equivalent to the expressed human UBE2D gene: for example, exon three (indicated) is larger than in the human orthologue. FANTOM mRNAs support two splice variants (red gene objects, numbers two and three from top). Two remaining splice variants are based on mouse EST matches (not shown). Pictures are from AceDB F-map (top) and Blixem (alignments, bottom), modified for clarity. (B) Compensatory frameshifts in FANTOM clone E030045F20 (AK053223) (brown track on the right) on mouse Chromosome 11, caused by splice variation at both the 5′ and 3′ ends of an internal exon in a gene homologous to human C22orf3. The more common variant is represented by FANTOM mRNA AK077457. The two transcripts (green gene objects on the left) have a different translation for the top exon shown here (blue and red highlights on the translations, for the objects marked with blue and red dots, respectively). Outlines on the DNA sequence show the respective exon boundaries and splice sites. The second exon shown is translated in the same reading frame for both variants (purple highlight), as are preceding and further downstream exons (not shown). A third variant, FANTOM mRNA BC062155, is a non-translating ψmRNA, as it has an out-of-frame alternative splice acceptor but lacks a compensating out-of-frame alternative splice donor site (red gene object on the left). Pictures are from AceDB F-map, modified for clarity.
