Microlife. 2023 Mar 10:4:uqad012.
doi: 10.1093/femsml/uqad012. eCollection 2023.

Unraveling the small proteome of the plant symbiont Sinorhizobium meliloti by ribosome profiling and proteogenomics

Lydia Hadjeras et al. Microlife. 2023.

Abstract

The soil-dwelling plant symbiont Sinorhizobium meliloti is a major model organism of Alphaproteobacteria. Despite numerous detailed OMICS studies, information about small open reading frame (sORF)-encoded proteins (SEPs) is largely missing, because sORFs are poorly annotated and SEPs are hard to detect experimentally. However, given that SEPs can fulfill important functions, identification of translated sORFs is critical for analyzing their roles in bacterial physiology. Ribosome profiling (Ribo-seq) can detect translated sORFs with high sensitivity, but is not yet routinely applied to bacteria because it must be adapted for each species. Here, we established a Ribo-seq procedure for S. meliloti 2011 based on RNase I digestion and detected translation for 60% of the annotated coding sequences during growth in minimal medium. Using ORF prediction tools based on Ribo-seq data, subsequent filtering, and manual curation, the translation of 37 non-annotated sORFs with ≤ 70 amino acids was predicted with confidence. The Ribo-seq data were supplemented by mass spectrometry (MS) analyses from three sample preparation approaches and two integrated proteogenomic search database (iPtgxDB) types. Searches against standard and 20-fold smaller Ribo-seq data-informed custom iPtgxDBs confirmed 47 annotated SEPs and identified 11 additional novel SEPs. Epitope tagging and Western blot analysis confirmed the translation of 15 out of 20 SEPs selected from the translatome map. Overall, by combining MS and Ribo-seq approaches, the small proteome of S. meliloti was substantially expanded by 48 novel SEPs. Several of them are part of predicted operons and/or are conserved from Rhizobiaceae to Bacteria, suggesting important physiological functions.

Keywords: Alphaproteobacteria; Ribosome profiling; Sinorhizobium meliloti; proteogenomics; proteomics; small open reading frame; small proteins.

Conflict of interest statement

The authors declare that they have no conflicts of interest.

Figures

Figure 1.
Establishment of ribosome profiling (Ribo-seq) for Sinorhizobium meliloti. (A) Schematic Ribo-seq workflow to map the S. meliloti 2011 translatome. Translating ribosomes (indicated by the polysome fraction) were first captured on the mRNAs. Unprotected mRNA regions were digested by RNase I, converting polysomes to monosomes. Approximately 30-nt-long footprints protected by and co-purified with 70S ribosomes were then subjected to cDNA library preparation and deep sequencing to identify the translatome under the used conditions. The small proteome was identified using HRIBO automated predictions and manual curation. Mass spectrometry and Western blot analysis of recombinant, tagged small open reading frame (sORF)-encoded proteins were used to validate the translated sORFs. (B) Sucrose gradient fractionation of the lysates. Cells were harvested at the exponential growth phase by a fast-chilling method to avoid polysome run-off. RNase I digestion led to enrichment of monosomes (70S peak in the green profile) in contrast to the untreated sample (Mock, black profile). Absorbance at 254 nm was measured. (C) Integrated genome browser screenshots depicting reads from Ribo-seq and RNA-seq libraries for two annotated ORFs: rpsO encoding ribosomal protein S15 and icd encoding isocitrate dehydrogenase. They show read coverage enrichment in the Ribo-seq library along their coding parts in contrast to the RNA-seq library but not in the ribosome-non-protected regions (UTRs). The UTRs of rpsO are marked. (D) Read coverage for rnpB corresponding to the housekeeping RNase P RNA. Reads are mostly restricted to the RNA-seq library, suggesting that this RNA is not translated. (E) The fixN1OQP operon shows read coverage in both the RNA-seq library and Ribo-seq library, the latter indicating that this operon contains translated genes. Genomic locations and coding regions are indicated below the image. Bent arrow indicates the transcription start site based on (Sallet et al. 2013).
Figure 2.
Ribosome profiling (Ribo-seq) captures the translatome of Sinorhizobium meliloti 2011 and reveals some features at the single-gene level. (A) Comparison of all annotated open reading frames (ORFs), annotated translated ORFs detected by Ribo-seq, and ORFs predicted to be translated by tools included in the HRIBO pipeline. To detect translation, we used the following parameters on the Ribo-seq data: TE of ≥ 0.5 and RNA-seq and Ribo-seq RPKM of ≥ 10. The numbers of ORFs per category are shown and represented by area size. Diagrams were prepared with BioVenn (www.biovenn.nl). (B) Scatter plot showing global TEs (TE = Ribo-seq/RNA-seq) computed from S. meliloti Ribo-seq replicates for all annotated coding sequences (CDS), annotated 5'- and 3'-UTRs, annotated housekeeping RNAs (hkRNA), annotated small RNAs (sRNAs) with (putative) regulatory functions, and annotated sORFs encoding proteins of ≤ 70 amino acids (aa). The purple lines indicate the mean TE for each transcript class. (C) Analysis of the two well-characterized sRNAs AbcR1 and AbcR2 by Ribo-seq. These two sRNAs show read coverage mostly in the RNA-seq library. (D) Ribo-seq reveals the active translation of the trpE leader peptide peTrpL (14 aa, encoded by the leaderless sORF trpL in the 5'-UTR (red arrow) and/or by the attenuator sRNA rnTrpL). In addition, the coverage of the Ribo-seq library shows that the biosynthetic gene trpE is translated in minimal medium, as expected. (E) Re-annotation of sORF SM2011_c05019 (50 aa). The GenBank 2014 annotation does not fit the RNA-seq and Ribo-seq read coverages. HRIBO predicts a shorter leaderless sORF (38 aa) that corresponds to the read coverage in both libraries. (F) Two ORFs missing from the GenBank 2014 annotation are revealed by Ribo-seq upstream of the nnrU gene related to denitrification. Genomic locations and coding regions are indicated below the image. Bent arrows indicate transcription start sites based on (Sallet et al. 2013).
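The detection criteria in panel (A) and the TE definition in panel (B) can be illustrated with a minimal Python sketch. It assumes that TE is taken as the ratio of Ribo-seq RPKM to RNA-seq RPKM and that RPKM follows the usual reads-per-kilobase-per-million definition; the function names and the example numbers are hypothetical and are not taken from the HRIBO pipeline.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def translation_efficiency(ribo_rpkm, rna_rpkm):
    """TE = Ribo-seq / RNA-seq; returns None when there is no RNA-seq signal."""
    if rna_rpkm == 0:
        return None
    return ribo_rpkm / rna_rpkm


def is_translated(ribo_rpkm, rna_rpkm, min_te=0.5, min_rpkm=10.0):
    """Apply the caption's thresholds: TE >= 0.5 and both RPKM values >= 10."""
    te = translation_efficiency(ribo_rpkm, rna_rpkm)
    return (
        te is not None
        and te >= min_te
        and ribo_rpkm >= min_rpkm
        and rna_rpkm >= min_rpkm
    )


# Hypothetical ORF of 150 nt in libraries of 2 million mapped reads each.
ribo = rpkm(read_count=300, length_nt=150, total_mapped_reads=2_000_000)
rna = rpkm(read_count=150, length_nt=150, total_mapped_reads=2_000_000)
print(ribo, rna, translation_efficiency(ribo, rna), is_translated(ribo, rna))
```

An ORF with a high TE but very low coverage in either library fails the check, which is why the RPKM floor is applied alongside the TE cutoff.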
Figure 3.
Ribo-seq reveals translated annotated small open reading frames (sORFs) in Sinorhizobium meliloti 2011. (A) Venn diagrams showing the overlap between all annotated sORFs (259 sORFs, GenBank 2014), the sORFs detected as translated by Ribo-seq (benchmark set, TE of ≥ 0.5, RNA-seq and Ribo-seq RPKM of ≥ 10, and extensive manual curation), and sORFs predicted by the automated ORF prediction tools Reparation or DeepRibo. (B) Histogram showing the length distribution of the 85 annotated sORFs identified as translated by Ribo-seq in comparison with the 259 annotated sORFs. (C) Integrated genome browser screenshot depicting reads from the Ribo-seq and RNA-seq libraries for the annotated sORF pilA1 (60 amino acids, encoding a pilin subunit). The genomic position and the coding region are indicated below the image. Bent arrows indicate transcription start sites based on (Sallet et al. 2013). (D) Genomic context for the translated annotated sORFs relative to the annotated neighboring genes. (E) Start (left) and stop (right) codon usage of the translated annotated sORFs. (F) Replicon distribution of the translated annotated sORFs.
Figure 4.
Ribo-seq uncovers a repertoire of small open reading frames (sORFs) missing from the Sinorhizobium meliloti 2011 genome annotation. (A) sORF predictions from HRIBO included a high number of potential non-annotated sORFs (approximately 15,000). These sORFs were first filtered (TE of ≥ 0.5, RNA-seq and Ribo-seq RPKM of ≥ 10, DeepRibo score of > −0.5) to generate a set of 266 translated sORF candidates that were additionally manually curated by inspection of the Ribo-seq read coverage in a genome browser. Overall, 54 high-confidence non-annotated sORFs displayed translation during growth in minimal medium. A Venn diagram shows the respective number of proteins from each category (scaled with area size). Diagrams were prepared with BioVenn (www.biovenn.nl). (B) Histogram showing the length distribution of the 54 non-annotated versus the 85 annotated sORFs identified as translated by Ribo-seq. (C) Genomic context of the translated non-annotated sORFs. (D) Replicon distribution of the translated non-annotated sORFs.
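The automated filtering step described in panel (A) can be sketched as follows. This is only an illustration under stated assumptions: the Candidate record, its field names, and the three example entries are hypothetical, and the manual curation that reduced the 266 candidates to 54 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one predicted non-annotated sORF."""
    name: str
    te: float
    rna_rpkm: float
    ribo_rpkm: float
    deepribo_score: float


def passes_filter(c, min_te=0.5, min_rpkm=10.0, min_score=-0.5):
    """Caption thresholds: TE >= 0.5, both RPKM >= 10, DeepRibo score > -0.5."""
    return (
        c.te >= min_te
        and c.rna_rpkm >= min_rpkm
        and c.ribo_rpkm >= min_rpkm
        and c.deepribo_score > min_score  # strict inequality, as in the caption
    )


# Invented example values, not data from the study.
candidates = [
    Candidate("cand_a", te=1.2, rna_rpkm=40.0, ribo_rpkm=48.0, deepribo_score=0.3),
    Candidate("cand_b", te=0.9, rna_rpkm=25.0, ribo_rpkm=22.5, deepribo_score=-0.5),
    Candidate("cand_c", te=0.2, rna_rpkm=90.0, ribo_rpkm=18.0, deepribo_score=1.1),
]

kept = [c.name for c in candidates if passes_filter(c)]
print(kept)
```

In this toy set, cand_b fails only because its score sits exactly on the cutoff, and cand_c fails on TE despite good coverage in both libraries.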
Figure 5.
Mass spectrometry-based identification of known and novel small open reading frame-encoded proteins (SEPs). (A) Experimental set-up for the proteomics analyses. Bacteria were grown in minimal and rich media, and protein extracts were further processed with tryptic in-solution digest (gray), solid-phase enrichment (SPE) of small proteins with subsequent Lys-C digestion (green), or without further digestion (blue). (B) Overlap of the identified SEPs by experimental approach; trypsin identified 45 SEPs; compared with the trypsin approach, Lys-C identified 38 SEPs (nine novel, 24%), and the approach without digestion found 30 SEPs (six novel, 20%). (C) Novel/unique identifications uncovered by the standard integrated proteogenomic search databases (iPtgxDB) and the small custom iPtgxDB. Standard iPtgxDB: Three peptides imply a 14 aa longer proteoform (60 aa) for HmuP than annotated; four peptides of the tmRNA-encoded proteolysis tag were identified; one peptide (3 peptide spectrum matches [PSMs]) implied a novel SEP (34 aa) internal to the genomic region that also encodes SM2011_b20335 but in a different frame. Spectra identifying these peptides are shown in Fig. S5. These identifications were also predicted by HRIBO based on Ribo-seq. Finally, six annotated proteins (GenBank 2014 and/or RefSeq 2017) were identified only in the search against the small custom iPtgxDB, as they did not accumulate enough spectral evidence in the search against the standard iPtgxDB (Table S4).
Figure 6.
Detection of 15 sequential peptide affinity (SPA)-tagged small open reading frame-encoded proteins (SEPs) in Sinorhizobium meliloti crude lysates. (A) Schematic representation of the empty plasmid pSW2 (contains no promoter and no ribosome-binding site upstream of the linker [L] and SPA-encoding sequence) and a pSW2-SEP plasmid for the analysis of sORF translation. The constitutive PsinI promoter (hatched box), the corresponding TSS (flexed arrow), the sORF coding sequence with its −15-nt-long region, the SPA-tag (with its molecular size indicated) preceded by a linker (L) (gray boxes), and the Trrn terminator (hairpin) are depicted. (B) to (F) Western blot analysis of crude lysates (upper panels) and the corresponding Coomassie-stained gels, and (G) corresponding Ponceau-stained membrane for selected SEPs. Monoclonal FLAG-directed antibodies were used. Migration of marker proteins (in kDa) is shown on the left side. *Unspecific signal. Above the panels, the numbers of the analyzed SEPs (Table S7), the presence (+) or absence (−) of a predicted TMH, and the molecular size (in kDa) of the SEP without the SPA tag are given. M: protein marker. C: empty vector control, lysate from a strain containing pSW2.
Figure 7.
Conservation analysis, functional prediction and operon assignment for 48 novel small open reading frames (sORFs) of Sinorhizobium meliloti 2011. The conservation analysis was conducted using tBLASTn. The respective hits (see Methods for parameters and cutoffs) are broadly summarized at the level of different taxonomic groups. The number of species outside the lower taxonomic unit that harbor a hit is given if it is < 10. In addition, the method by which the respective sORF was detected or confirmed is shown (Ribo-seq: ribosome profiling, MS: proteomics, WB: Western blot), as well as the results of predictions for membrane localization (by TMHMM and PSORTb), signal peptide II cleavage sites of lipoproteins (by LipoP), and function (by Phyre2; only hits with confidence levels greater than 30% are shown). For details on Phyre2 prediction and genomic context including operon prediction, see Table S8 and Table S9. sORF1 to sORF55 are a subset of the Ribo-seq-detected, translated sORFs, which are listed in Table S7, and sORF56 to sORF66 represent the novel sORFs identified by proteomics. sORFs encoding small proteins below 30 amino acids are shown in red. The putative sORF64, present in tmRNA, contains the proteolytic tag sequence. The sORF65 corresponds to the N-terminal HmuP extension; outside of Proteobacteria, it is conserved in many genera of Planctomycetes. *Structural genomics (92% confidence homology to protein of unknown function).
Figure 8.
Translated sORF (SEP) candidates and their detection by different methods. Overlap between the 191 MS-detected SEP candidates (annotated and non-annotated), the 85 Ribo-seq-detected, manually curated sORFs present in the GenBank 2014 annotation and the 266 Ribo-seq-detected sORF candidates, which are missing in the GenBank 2014 annotation (Table S4). SEPs and translated sORFs, which are missing from both the GenBank 2014 and RefSeq 2017 annotations, were designated ‘novel’. Two of the 11 Ribo-seq-detected novel sORFs are present in the RefSeq 2022 annotation (Table S4). Passing the stringent filtering criteria and (in the case of Ribo-seq) the manual curation, and detection by more than one method increases the confidence in sORF translation (for details see the master Table S4).

References

Ahrens CH, Wade JT, Champion MM et al. A practical guide to small protein discovery and characterization using mass spectrometry. J Bacteriol. 2022;204:e0035321.
Allen RJ, Brenner EP, VanOrsdel CE et al. Conservation analysis of the CydX protein yields insights into small protein identification and evolution. BMC Genomics. 2014;15:946.
Aoyama JJ, Raina M, Zhong A et al. Dual-function Spot 42 RNA encodes a 15-amino acid protein that regulates the CRP transcription factor. Proc Natl Acad Sci USA. 2022;119:e2119866119.
Barra-Bily L, Fontenelle C, Jan G et al. Proteomic alterations explain phenotypic changes in Sinorhizobium meliloti lacking the RNA chaperone Hfq. J Bacteriol. 2010;192:1719–29.
Bartel J, Varadarajan AR, Sura T et al. Optimized proteomics workflow for the detection of small proteins. J Proteome Res. 2020;19:4004–18.