Cell Rep. 2023 Nov 28;42(11):113311.
doi: 10.1016/j.celrep.2023.113311. Epub 2023 Oct 26.

Molecular and functional characterization of the Drosophila melanogaster conserved smORFome

Justin A Bosch et al. Cell Rep. 2023.

Abstract

Short polypeptides encoded by small open reading frames (smORFs) are ubiquitous in eukaryotic genomes and are important regulators of physiology, development, and mitochondrial processes. Here, we focus on a subset of 298 smORFs that are evolutionarily conserved between Drosophila melanogaster and humans. Many of these smORFs are conserved broadly in the bilaterian lineage, and ∼182 are conserved in plants. We observe remarkably heterogeneous spatial and temporal expression patterns of smORF transcripts, indicating widespread tissue-specific and stage-specific mitochondrial architectures. In addition, an analysis of annotated functional domains reveals a predicted enrichment of smORF polypeptides localizing to mitochondria. We conduct an embryonic ribosome profiling experiment and find support for translation of 137 of these smORFs during embryogenesis. We further embark on functional characterization using CRISPR knockout/activation, RNAi knockdown, and cDNA overexpression, revealing diverse phenotypes. This study underscores the importance of identifying smORF function in disease and phenotypic diversity.

Keywords: CP: Genomics; CRISPR; Drosophila; gene function; gene knockout; peptide; ribosome profiling; smORF.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1. Conservation of smORF dataset.
(A) Flowchart of the bioinformatic identification of 298 fly-human conserved smORFs. (B) Number of conserved smORFs in the dataset with and without homologs in a selection of species with well-annotated transcriptomes. (C) Multiple-species alignments of conserved smORFs, including mean amino acid hydrophobicity at each alignment position. The bc10 transcript (upper) encodes one smORF, whereas the CG42497 and Tim10 transcript (lower) is polycistronic. See also Supplemental File 1 and Supplemental Figure S1.
Figure 2. Gene Ontology (GO) and KEGG enrichment analysis of conserved smORFs.
Significantly enriched GO terms for molecular function (“MF”), biological process (“BP”), and cellular component (“CC”) are plotted. GO and KEGG enrichment analyses were performed with g:Profiler. Significantly enriched terms (<10⁻⁵) are shown that also encompass all conserved smORFs classified as serine-type endopeptidase inhibitors and mitochondria-associated conserved smORFs. See also Supplemental File 2.
Figure 3. In situ mRNA hybridization patterns of mitochondria-associated smORFs.
(A) Clustering of in situ mRNA expression patterns of mitochondria-associated conserved smORFs. For each mitochondrial conserved smORF, organs with an assigned expression pattern are represented by red boxes, while blue boxes represent no annotated expression. Expression patterns across embryo stages are collapsed. “DPPM” = dorsal prothoracic pharyngeal muscle; “MT” = Malpighian tubules; “VNC” = ventral nerve cord; “D and V epidermis” = dorsal and ventral epidermis. (B) In situ mRNA hybridization images for each mitochondria-associated conserved smORF with patterned expression. Each image was taken between embryonic stages 13–16. Scale bar is 50 μm. See also Supplemental Files 3 and 4.
Figure 4. Ribosome Profiling.
(A) Overview of the ribosome profiling workflow, in which polysomes are isolated and inter-ribosome RNA is then digested. Ribosome-protected fragments (RPFs) are then collected. (B) After sequencing, the number of in-frame reads is analyzed to determine whether RPFs were successfully sequenced. Distribution of tags per million (TPM) for six embryonic time periods. (C) Comparison of ribosome profiling sequencing to mRNA-seq sequencing, showing that ribosome profiling libraries are constrained to the CDS while mRNA libraries map to the entire annotated transcript. REPTOR-BP encodes four small peptides of 93, 94, 117, and 118 aa (blue boxes). These peptides share the same translation start site (arrowhead) and differ by the addition of a glutamine (indicated by the different splice sites in the second exon) and by their carboxy termini (indicated by alternate splice sites and red bars). See also Supplemental Files 5 and 6, and Supplemental Figures S2 and S3.
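The in-frame read check and the tags-per-million scaling mentioned in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 12-nt P-site offset, and the definition of tags per million as raw counts scaled to one million total reads are all assumptions made for illustration.

```python
from collections import Counter


def frame_fractions(read_starts, cds_start, p_site_offset=12):
    """Fraction of reads whose inferred P-site lies in frame 0, 1, or 2.

    read_starts: 5' end positions of reads on the transcript (hypothetical input).
    cds_start: position of the first nucleotide of the start codon.
    p_site_offset: assumed distance from the read 5' end to the P-site.
    """
    counts = Counter()
    for start in read_starts:
        p_site = start + p_site_offset
        if p_site < cds_start:
            continue  # ignore reads whose P-site lies upstream of the CDS
        counts[(p_site - cds_start) % 3] += 1
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


def tags_per_million(gene_counts):
    """Scale raw per-gene read counts so that they sum to one million."""
    total = sum(gene_counts.values())
    return {gene: count * 1e6 / total for gene, count in gene_counts.items()}


if __name__ == "__main__":
    # Toy data: three reads in frame 0 and one read in frame 1.
    print(frame_fractions([88, 91, 94, 89], cds_start=100))
    print(tags_per_million({"smORF_a": 1, "smORF_b": 3}))
```

A library made of genuine ribosome-protected fragments is expected to show a strong bias toward one frame, whereas fragments of untranslated RNA should spread roughly evenly across all three.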
Figure 5. Functional characterization of conserved smORFs by F1 CRISPR in vivo screening.
(A) Genetic cross to perform CRISPR somatic knockout in the F1 generation. (B) Quantification of viability of F1 flies from 115 sgRNA-KO crosses. Number of F1 progeny counted per cross was 33 < n < 918. (C) Genetic cross to perform CRISPR gene overexpression in the F1 generation. (D) Quantification of viability of F1 flies from 123 sgRNA-OE crosses. Number of F1 progeny counted per cross was 56 < n < 220. (E) Images of adult female flies aged seven days after eclosion for two indicated genotypes. Scale bar = 1 mm. (F) Quantification of viability of F1 flies from 68 UAS-cDNA crosses. Number of F1 progeny counted per cross was 101 < n < 706. See also Supplemental File 7 and Supplemental Figures S4 and S5.
Figure 6. Full CRISPR knockout of 25 uncharacterized smORFs raised on stressful foods.
Developmental timing from egg deposition to adult eclosion in smORF KO mutants raised on (A) control food, (B) high-salt food (30% NaCl), (C) high-fat food (30% coconut oil), and (D) starvation food (30% food, 70% PBS + 1% agar). Significance was determined by one-way ANOVA followed by Dunnett's post hoc test; P = 0.05 (*), P = 0.01 (**), P = 0.001 (***). Error bars show SD. See also Supplemental File 7 and Supplemental Figure S6. Each genotype-foodtype experiment was carried out with at least biological replicates.
