Review. Bioessays. 2015 Jan;37(1):103-12.
doi: 10.1002/bies.201400103. Epub 2014 Oct 24.

Identifying (non-)coding RNAs and small peptides: challenges and opportunities

Andrea Pauli et al. Bioessays. 2015 Jan.

Abstract

Over the past decade, high-throughput studies have identified many novel transcripts. While their existence is undisputed, their coding potential and functionality have remained controversial. Recent computational approaches guided by ribosome profiling have indicated that translation is far more pervasive than anticipated and takes place on many transcripts previously assumed to be non-coding. Some of these newly discovered translated transcripts encode short, functional proteins that had been missed in prior screens. Other transcripts are translated, but it might be the process of translation rather than the resulting peptides that serves a function. Here, we review annotation studies in zebrafish to discuss the challenges of placing RNAs onto the continuum that ranges from functional protein-encoding mRNAs to potentially non-functional peptide-producing RNAs to non-coding RNAs. As highlighted by the discovery of the novel signaling peptide Apela/ELABELA/Toddler, accurate annotations can give rise to exciting opportunities to identify the functions of previously uncharacterized transcripts.

Keywords: Apela/ELABELA/Toddler; coding potential; gene annotation; ncRNAs; peptides; short ORFs; zebrafish.

Figures

Figure 1. A continuum from protein-coding to non-coding RNAs
Protein-coding transcripts (blue) and non-coding RNAs (ncRNAs, red) are at either end of the spectrum of translation. The transition zone in between is populated by transcripts with translated ORFs whose peptide products might not be functional. Illustrated here are only transcripts that are functional or potentially functional (non-functional transcripts are not included).
Figure 2. Overview of zebrafish transcript annotation pipelines
Outline of five zebrafish transcript annotation pipelines with their input data, classification strategies and output data. Two pipelines focused on identifying non-coding transcripts [68, 69], one on testing and revising previous non-coding RNA predictions [13], and two on identifying uncharacterized protein-coding genes [14, 15]. Single asterisk (*): an ORF threshold of < 30 aa was used for transcripts mapping to genomic regions without alignments [69]. Double asterisk (**): 435 lncRNAs from [68] with sense-overlapping transcripts in the Embryonic Transcriptome from [69]. smORF: short ORF encoding a peptide of < 100 aa. For details see main text and Box 2.
Figure 3. Levels of gene annotation
At the most basic level the presence of a gene is indicated by evidence of expression of the locus or by computational prediction of its locus (top). The next level of annotation is the determination of a transcript's exon-intron structure, which is usually followed by the prediction of its coding potential (middle). The ultimate level of annotation is reached by discovering a gene's function (bottom). The computational methods (left) and experimental approaches (right) that can be used to reach each level of gene annotation are outlined.
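
The smORF definition in the Figure 2 caption (a short ORF encoding a peptide of fewer than 100 aa) can be illustrated with a minimal Python sketch. This is not the method of any of the five pipelines: it assumes a simple ATG-to-stop scan of the three forward reading frames only, and the names `find_orfs`, `is_smorf` and `SMORF_MAX_AA` are hypothetical. The actual pipelines combine such length thresholds with further evidence, for example the ribosome profiling data mentioned in the abstract.

```python
# Illustrative sketch only: a naive forward-strand ORF scan, not the approach
# used by the zebrafish annotation pipelines summarized in Figure 2.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
SMORF_MAX_AA = 100  # caption threshold: smORF peptides are < 100 aa


def find_orfs(seq):
    """Return (start, end, aa_length) for each ATG-to-stop ORF.

    Scans the three forward reading frames. Within a frame, an ORF begins at
    the first ATG after the previous stop codon. aa_length counts codons from
    the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                orfs.append((start, i + 3, (i - start) // 3))
                start = None
    return orfs


def is_smorf(aa_length, max_aa=SMORF_MAX_AA):
    """True if the encoded peptide is shorter than the smORF threshold."""
    return aa_length < max_aa


if __name__ == "__main__":
    # Hypothetical toy transcript: ATG + 2 codons + stop encodes 3 aa.
    for start, end, aa in find_orfs("ATGAAATTTTAG"):
        label = "smORF" if is_smorf(aa) else "long ORF"
        print(start, end, aa, label)
```

Passing `max_aa=30` to `is_smorf` would apply the stricter 30 aa threshold that the caption attributes to transcripts in regions without alignments [69]. A length cutoff alone says nothing about whether an ORF is translated or whether its peptide is functional.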

References

    1. Okazaki Y, Furuno M, Kasukawa T, Adachi J, et al. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002;420:563–73.
    2. Bertone P, Stolc V, Royce TE, Rozowsky JS, et al. Global identification of human transcribed sequences with genome tiling arrays. Science. 2004;306:2242–6.
    3. Carninci P, Kasukawa T, Katayama S, Gough J, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309:1559–63.
    4. Kapranov P, Cheng J, Dike S, Nix DA, et al. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007;316:1484–8.
    5. ENCODE Project Consortium. Birney E, Stamatoyannopoulos JA, Dutta A, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816.
