Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation
- PMID: 24705786
- PMCID: PMC4193932
- DOI: 10.1002/embj.201488411
Abstract
Identification of the coding elements in the genome is a fundamental step to understanding the building blocks of living systems. Short peptides (< 100 aa) have emerged as important regulators of development and physiology, but their identification has been limited by their size. We have leveraged the periodicity of ribosome movement on the mRNA to define actively translated ORFs by ribosome footprinting. This approach identifies several hundred translated small ORFs in zebrafish and human. Computational prediction of small ORFs from codon conservation patterns corroborates and extends these findings and identifies conserved sequences in zebrafish and human, suggesting functional peptide products (micropeptides). These results identify micropeptide-encoding genes in vertebrates, providing an entry point to define their function in vivo.
Figures

Schematic representation of ribosome profiling: 28- to 29-nt-long ribosome-protected fragments (RPFs) are generated by nuclease digestion, with the P-site of the ribosome at position 13.
Developmental stages at which ribosome profiling was performed.
Subcodon position of the ribosome footprints (position 13) for the RPF and input reads. Plot shows the proportion of RPFs or input reads aligned to the coding sequence of RefSeq genes at each position relative to the codon. Input reads were obtained after poly-(A) fractionation and random fragmentation of the naked RNA.
RPFs and input reads mapped to a composite RefSeq transcript. RPFs mainly map to the CDS with a 3-nucleotide periodicity. RPF reads are colored as in (C) based on the position with respect to the frame of the CDS. Input reads map to both the UTRs and CDS (gray).
Subcodon profile plot showing RPF and input reads aligned to actinb1. Reads are colored based on the frame (1, 2 or 3) position relative to the transcript (Michel et al, 2012). All putative ORFs (distal AUG-Stop) were also colored for each respective frame (blue, pink and green boxes). Note that most of the RPFs from the annotated ORF match the color of the box, consistent with a strong in-frame distribution of reads within individual transcripts.
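The subcodon analysis described in these captions can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `subcodon_proportions`, the 0-based coordinate convention, and the input layout (a list of read 5' start positions on one transcript) are assumptions made for the example. Only the P-site location (position 13 of the read) comes from the caption.

```python
# Hypothetical helper: proportion of reads whose P-site falls on each
# codon position of an annotated CDS.

P_SITE_OFFSET = 12  # position 13 of the read (1-based) == offset 12 (0-based)


def subcodon_proportions(read_starts, cds_start, cds_end, offset=P_SITE_OFFSET):
    """Return [p0, p1, p2], the fraction of P-sites at each subcodon position.

    read_starts: 0-based transcript coordinates of read 5' ends (assumed layout).
    cds_start:   0-based coordinate of the first nucleotide of the start codon.
    cds_end:     0-based exclusive end of the CDS.
    """
    counts = [0, 0, 0]
    for start in read_starts:
        psite = start + offset
        # Only P-sites inside the CDS contribute to the profile.
        if cds_start <= psite < cds_end:
            counts[(psite - cds_start) % 3] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [c / total for c in counts]


# Toy example: three reads in frame with the CDS, one shifted by 1 nt.
props = subcodon_proportions([88, 91, 94, 89], cds_start=100, cds_end=200)
print(props)
```

Under these assumptions, footprints from a translated ORF should concentrate on one subcodon position, whereas randomly fragmented input reads should spread roughly evenly across all three.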

Workflow to define the ORFscore: the top diagram represents a transcript; the solid bars below it represent all possible ORFs (distal AUG-Stop) identified in each frame (+1, +2, +3). The RPF distribution in each frame is compared to an equally sized uniform distribution using a modified chi-squared statistic (see Materials and Methods). The resulting ORFscore is assigned a negative value when the distribution of RPFs is inconsistent with the frame of the CDS.
Coverage is determined by measuring the proportion of in-frame CDS positions with ≥ 1 reads.
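The two measures in these captions can be illustrated with a short sketch. The captions only state that a modified chi-squared statistic is used and refer to Materials and Methods for details, so the exact form below (log2 of the chi-squared statistic plus one, with the sign flipped when frame 1 is not the dominant frame) is an assumption for illustration; the function names and input layouts are likewise hypothetical.

```python
import math


def orf_score(frame_counts):
    """Sketch of an ORFscore-like statistic (assumed form, see lead-in).

    frame_counts: (f1, f2, f3), RPF counts in each frame of a candidate ORF,
    where f1 is the frame of the ORF itself.
    """
    f1, f2, f3 = frame_counts
    total = f1 + f2 + f3
    if total == 0:
        return 0.0
    expected = total / 3  # equally sized uniform distribution
    chi2 = sum((f - expected) ** 2 / expected for f in (f1, f2, f3))
    score = math.log2(chi2 + 1)
    # Negative when the reads are inconsistent with the frame of the ORF.
    if f1 < f2 or f1 < f3:
        return -score
    return score


def in_frame_coverage(psite_counts):
    """Proportion of in-frame positions with at least one read.

    psite_counts: per-nucleotide P-site counts over the ORF, index 0 being
    the first nucleotide of the start codon (assumed layout).
    """
    in_frame = psite_counts[0::3]
    if not in_frame:
        return 0.0
    return sum(1 for c in in_frame if c >= 1) / len(in_frame)


print(orf_score((30, 0, 0)))   # all reads in frame: positive score
print(orf_score((0, 30, 0)))   # reads in another frame: negative score
print(orf_score((10, 10, 10))) # uniform reads: score of zero
print(in_frame_coverage([1, 0, 0, 0, 5, 0, 2, 0, 0]))
```

Reporting coverage alongside the score helps separate ORFs with reads spread along their length from ORFs whose score is driven by a few heavily covered positions.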



Comment in
- Everything old is new again: (linc)RNAs make proteins! EMBO J. 2014 May 2;33(9):937-8. doi: 10.1002/embj.201488303. Epub 2014 Apr 9. PMID: 24719208. Free PMC article.
Similar articles
- Translation of Small Open Reading Frames: Roles in Regulation and Evolutionary Innovation. Trends Genet. 2019 Mar;35(3):186-198. doi: 10.1016/j.tig.2018.12.003. Epub 2018 Dec 31. PMID: 30606460. Review.
- Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs. BMC Genomics. 2013 Sep 23;14:648. doi: 10.1186/1471-2164-14-648. PMID: 24059539. Free PMC article.
- De novo Identification of Actively Translated Open Reading Frames with Ribosome Profiling Data. J Vis Exp. 2022 Feb 18;(180). doi: 10.3791/63366. PMID: 35253791.
- Genome-Wide Analysis of Actively Translated Open Reading Frames Using RiboTaper/ORFquant. Methods Mol Biol. 2021;2252:331-346. doi: 10.1007/978-1-0716-1150-0_16. PMID: 33765284.
- Identifying (non-)coding RNAs and small peptides: challenges and opportunities. Bioessays. 2015 Jan;37(1):103-12. doi: 10.1002/bies.201400103. Epub 2014 Oct 24. PMID: 25345765. Free PMC article. Review.
Cited by
- Global Analysis of Truncated RNA Ends Reveals New Insights into Ribosome Stalling in Plants. Plant Cell. 2016 Oct;28(10):2398-2416. doi: 10.1105/tpc.16.00295. Epub 2016 Oct 14. PMID: 27742800. Free PMC article.
- Diverse regulatory interactions of long noncoding RNAs. Curr Opin Genet Dev. 2016 Feb;36:73-82. doi: 10.1016/j.gde.2016.03.014. Epub 2016 May 3. PMID: 27151434. Free PMC article. Review.
- Ribosome profiling: a powerful tool in oncological research. Biomark Res. 2024 Jan 25;12(1):11. doi: 10.1186/s40364-024-00562-4. PMID: 38273337. Free PMC article. Review.
- Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome. Genome Biol. 2024 Jul 8;25(1):183. doi: 10.1186/s13059-024-03287-7. PMID: 38978079. Free PMC article.
- Translated Long Non-Coding Ribonucleic Acid ZFAS1 Promotes Cancer Cell Migration by Elevating Reactive Oxygen Species Production in Hepatocellular Carcinoma. Front Genet. 2019 Nov 12;10:1111. doi: 10.3389/fgene.2019.01111. eCollection 2019. PMID: 31781169. Free PMC article.
Grants and funding
- R01GM103789-01/GM/NIGMS NIH HHS/United States
- R01 GM097194/GM/NIGMS NIH HHS/United States
- R01 GM095982/GM/NIGMS NIH HHS/United States
- R01HD074078-02/HD/NICHD NIH HHS/United States
- R01GM081602-06/GM/NIGMS NIH HHS/United States
- F32 HD071697/HD/NICHD NIH HHS/United States
- R01 GM103789/GM/NIGMS NIH HHS/United States
- R01 GM101108/GM/NIGMS NIH HHS/United States
- UL1 TR000142/TR/NCATS NIH HHS/United States
- R01GM095982/GM/NIGMS NIH HHS/United States
- R01 HD074078/HD/NICHD NIH HHS/United States
- F32HD071697-02/HD/NICHD NIH HHS/United States
- R01GM097194/GM/NIGMS NIH HHS/United States
- T32GM007499/GM/NIGMS NIH HHS/United States
- R01 GM081602/GM/NIGMS NIH HHS/United States
- T32 GM007499/GM/NIGMS NIH HHS/United States