Development. 2013 Jul;140(13):2828-34.
doi: 10.1242/dev.098343. Epub 2013 May 22.

Ribosome profiling reveals resemblance between long non-coding RNAs and 5' leaders of coding RNAs

Guo-Liang Chew et al. Development. 2013 Jul.

Abstract

Large-scale genomics and computational approaches have identified thousands of putative long non-coding RNAs (lncRNAs). It has been controversial, however, what fraction of these RNAs is truly non-coding. Here, we combine ribosome profiling with a machine-learning approach to validate lncRNAs during zebrafish development in a high-throughput manner. We find that dozens of proposed lncRNAs are protein-coding contaminants and that many lncRNAs have ribosome profiles that resemble the 5' leaders of coding RNAs. Analysis of ribosome profiling data from embryonic stem cells reveals similar properties for mammalian lncRNAs. These results clarify the annotation of developmental lncRNAs and suggest a potential role for translation in lncRNA regulation. In addition, our computational pipeline and ribosome profiling data provide a powerful resource for the identification of translated open reading frames during zebrafish development.

Keywords: ES cells; Embryogenesis; Long non-coding RNAs; Ribosome profiling; Zebrafish.

Figures

Fig. 1.
Overview of lncRNA classification pipeline. (A,B) High-throughput sequencing data (ribosome profiling and RNA-seq) from eight early developmental stages (A) is used to train a classifier with RefSeq coding sequences (CDSs), 5′ leaders and 3′ trailers (B). (C) The translated ORF classifier (TOC) uses ribosome profiles and gene expression levels to classify putative lncRNAs as protein-coding (blue), leader-like (green) or trailer-like (red).
Fig. 2.
Ribosome profiles outline translated ORFs of coding genes. (A) Representative examples of ribosome-protected fragment (RPF) densities associated with protein-coding genes. Gene structures are depicted as thick bars for the coding sequence (CDS), thin bars for 5′ leaders and 3′ trailers, and dashed lines for introns. Note that the majority of RPFs map within the CDSs and are flanked by the annotated initiation (START, green) and termination (STOP, red) codons. The bottom three panels show examples of uORF-containing genes. For these genes, RPF reads map to the CDSs and to short ORFs within the 5′ leaders. (B) RefSeq metagene analysis of relative phasing of ribosome P-sites (see Materials and methods). Relative phasing is defined as the number of RPFs at a given position divided by the mean of the number of RPFs at the four adjacent positions, i.e. relative phasing at position i = (RPFs at position i) / mean(RPFs at positions i-2, i-1, i+1 and i+2). As in previous studies (Ingolia et al., 2011), triplet phasing of ribosome profiles was observed.
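The relative phasing definition in the Fig. 2B legend can be illustrated with a minimal Python sketch. The function name `relative_phasing`, the toy count vector, and the handling of edge positions and zero denominators are assumptions made for illustration only; they are not taken from the authors' pipeline.

```python
from statistics import mean


def relative_phasing(rpf_counts, i):
    """Relative phasing at position i, following the Fig. 2B legend:
    RPFs at i divided by the mean RPFs at i-2, i-1, i+1 and i+2.

    rpf_counts is a hypothetical list of per-position P-site counts.
    """
    # Assumption: positions without two neighbours on each side are rejected.
    if i < 2 or i + 2 >= len(rpf_counts):
        raise IndexError("position needs two neighbours on each side")
    neighbours = [
        rpf_counts[i - 2],
        rpf_counts[i - 1],
        rpf_counts[i + 1],
        rpf_counts[i + 2],
    ]
    denominator = mean(neighbours)
    # Assumption: an empty neighbourhood yields NaN rather than an error.
    if denominator == 0:
        return float("nan")
    return rpf_counts[i] / denominator


# Toy counts with a strong three-nucleotide periodicity (made-up numbers).
counts = [9, 1, 2, 9, 1, 2, 9, 1, 2]
in_frame = relative_phasing(counts, 3)   # 9 / mean(1, 2, 1, 2)
off_frame = relative_phasing(counts, 4)  # 1 / mean(2, 9, 2, 9)
print(in_frame, off_frame)
```

With these toy counts, the in-frame position scores well above 1 and the off-frame position well below 1, which is the kind of pattern the legend refers to as triplet phasing.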
Fig. 3.
TOC distinguishes ORFs in 5′ leaders, CDSs and 3′ trailers. (A) A training set is constructed from RefSeq genes using (1) annotated CDSs (coding ORFs, blue) in the context of the whole transcript, (2) RPF-containing ORFs in the 5′ leader sequence (green) in the context of the 5′ leader, and (3) RPF-containing ORFs in the 3′ trailer (red) in the context of the 3′ trailer (see Materials and methods). The four metrics used to train the classifier are displayed in the gray box (TE, translational efficiency; IO, inside versus outside; FL, fragment length; DS, disengagement score). After training, TOC uses RPF-covered ORFs to classify transcripts. (B) The combination of the four metrics separates coding ORFs, leaders and trailers of the training set. Transcripts lacking a protein-coding ORF cluster with trailers and leaders of the training set, as shown for three validated zebrafish lncRNAs (black). The density of each measure is shown along the axes.
Fig. 4.
TOC refines classification of lncRNAs. (A) TOC-based classification improves previous lncRNA predictions. Shown are RNA-seq and ribosome profiling read densities associated with three putative lncRNAs (Ulitsky et al., 2011), which had conflicting annotations in published zebrafish lncRNA sets (Pauli et al., 2012; Ulitsky et al., 2011). Transcript structures are shown in black. Introns are indicated as dashed lines. The region scoring highest in PhyloCSF (Lin et al., 2011) is indicated in orange. Whereas TOC reveals the protein-coding nature of linc-ca2, it confirms the non-coding nature of the two conserved lncRNAs megamind and cyrano. These two lncRNAs had been filtered out in the Pauli et al. lncRNA set owing to their relatively high phylogenetic codon substitution frequency scores (PhyloCSF >20). (B) Fraction of loci that are classified by TOC as coding (blue), leader-like (green) and trailer-like (red) in three collections of lncRNAs: ZF1 (Pauli et al., 2012), ZF2 (Ulitsky et al., 2011) and mESCs (Guttman et al., 2011).

References

    Arribere J. A., Gilbert W. V. (2013). Roles for transcript leaders in translation and mRNA decay revealed by transcript leader sequencing. Genome Res. [Epub ahead of print] doi: 10.1101/gr.150342.112
    Bánfai B., Jia H., Khatun J., Wood E., Risk B., Gundling W. E., Jr, Kundaje A., Gunawardena H. P., Yu Y., Xie L., et al. (2012). Long noncoding RNAs are rarely translated in two human cell lines. Genome Res. 22, 1646–1657
    Barlow D. P., Stöger R., Herrmann B. G., Saito K., Schweifer N. (1991). The mouse insulin-like growth factor type-2 receptor is imprinted and closely linked to the Tme locus. Nature 349, 84–87
    Bartolomei M. S., Zemel S., Tilghman S. M. (1991). Parental imprinting of the mouse H19 gene. Nature 351, 153–155
    Bazzini A. A., Lee M. T., Giraldez A. J. (2012). Ribosome profiling shows that miR-430 reduces translation before causing mRNA decay in zebrafish. Science 336, 233–237