mBio. 2019 Mar 5;10(2):e02819-18.
doi: 10.1128/mBio.02819-18.

Identifying Small Proteins by Ribosome Profiling with Stalled Initiation Complexes

Jeremy Weaver et al. mBio.

Abstract

Small proteins consisting of 50 or fewer amino acids have been identified as regulators of larger proteins in bacteria and eukaryotes. Despite the importance of these molecules, the total number of small proteins remains unknown because conventional annotation pipelines usually exclude small open reading frames (smORFs). We previously identified several dozen small proteins in the model organism Escherichia coli using theoretical bioinformatic approaches based on sequence conservation and matches to canonical ribosome binding sites. Here, we present an empirical approach for discovering new proteins, taking advantage of recent advances in ribosome profiling in which antibiotics are used to trap newly initiated 70S ribosomes at start codons. This approach led to the identification of many novel initiation sites in intergenic regions in E. coli. We tagged 41 smORFs on the chromosome and detected protein synthesis for all but three. Not only are the corresponding genes intergenic but they are also found antisense to other genes, in operons, and overlapping other open reading frames (ORFs), some impacting the translation of larger downstream genes. These results demonstrate the utility of this method for identifying new genes, regardless of their genomic context.

IMPORTANCE Proteins comprised of 50 or fewer amino acids have been shown to interact with and modulate the functions of larger proteins in a range of organisms. Despite the possible importance of small proteins, the true prevalence and capabilities of these regulators remain unknown as the small size of the proteins places serious limitations on their identification, purification, and characterization. Here, we present a ribosome profiling approach with stalled initiation complexes that led to the identification of 38 new small proteins.

Keywords: Ribo-seq; alternate ORFs; antisense; genome annotation; leader peptide; small protein.

Figures

FIG 1
Onc112 and retapamulin similarly trap ribosomes at start codons. (a) Ribosome density on the lpp gene from Onc112-treated (blue), retapamulin-treated (red) samples and an untreated control (gray). (Inset) Close-up view of the start site of lpp. (b) Average ribosome density at many genes aligned at their start sites in a sample treated with Onc112 and an untreated control. (c) Scatter plot of density at start sites in annotated genes in samples treated with Onc112 or retapamulin. The Spearman rank correlation is reported.
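The Spearman rank correlation mentioned for panel (c) can be illustrated with a minimal sketch. It assumes two equal-length lists of start-site densities, one per antibiotic treatment; the function names and any input values are hypothetical, and the legend does not say which software the authors used.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the statistic is insensitive to differences in absolute read depth between the two treated samples.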
FIG 2
Using ribosome profiling data to discover new smORFs. (a) Flow chart showing the criteria used to identify smORFs in intergenic regions. (b) CDF plot showing the percentage of known, annotated smORFs (n = 44; right y axis, black and gray) with ribosome density near the start site less than or equal to the value on the x axis, compared with candidate smORFs (n = 160,995; left y axis, red and orange). Candidates with an average of >5 rpm were selected for further screening (broken line). (c) The proper spacing of ribosome density at start codons in treated samples helps to identify bona fide small protein-coding genes such as ORF22/yqgH. (d) In cases where several start codons could explain the ribosome density, spacing helps determine the correct site. ORF9/yhiY likely initiates with the second AUG codon of the three shown. (e) Many candidates were rejected because the start site does not align properly with the density observed.
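The first screening steps described in this legend can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes forward-strand scanning only, an assumed start codon set (ATG, GTG, TTG), that rpm means reads per million mapped reads, and that the >5 rpm average is taken over a list of per-sample values. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: not stated in the legend
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 50  # small proteins: 50 or fewer amino acids


def find_smorfs(seq: str, max_codons: int = MAX_CODONS) -> List[Tuple[int, int]]:
    """Return 0-based (start, end) spans of forward-strand ORFs whose coding
    length (start codon included, stop codon excluded) is <= max_codons."""
    seq = seq.upper()
    orfs = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in START_CODONS:
            continue
        # Walk in frame until the first stop codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                if (j - i) // 3 <= max_codons:
                    orfs.append((i, j + 3))  # end is exclusive, stop included
                break
    return orfs


def reads_per_million(counts: Sequence[int], total_reads: int) -> List[float]:
    """Normalize raw read counts to reads per million mapped reads."""
    return [c * 1e6 / total_reads for c in counts]


def passes_threshold(rpm_values: Sequence[float], threshold: float = 5.0) -> bool:
    """Keep a candidate only if its mean start-site density exceeds the cutoff."""
    return sum(rpm_values) / len(rpm_values) > threshold
```

A candidate that passes this density cutoff would still need the spacing check described in panels (c) to (e), which this sketch does not attempt.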
FIG 3
Western analysis confirms synthesis of 95% of predicted small proteins tested. E. coli MG1655 strains with chromosomally tagged, putative smORFs were grown to exponential (E) and stationary (S) phase in rich media (LB). Gel samples were prepared to load equivalent numbers of cells based on OD600. Immunoblot analysis was conducted against the 3× FLAG motif included in the SPA tag using HRP-conjugated, anti-FLAG antibodies. Wild-type MG1655 was included as a negative control. Blots requiring a longer exposure to show tagged proteins have more background bands. Bands corresponding to small proteins are marked with an asterisk.
FIG 4
Observed small protein levels span several orders of magnitude. Stationary-phase samples grown in LB from Fig. 3 (black) were compared to each other and to similarly prepared samples of previously detected small proteins (gray) with the same chromosomal tag (17, 42). Immunoblot analysis for cells grown to stationary phase was conducted as described in the legend to Fig. 3 with E. coli MG1655 as a negative control. All samples are in the MG1655 background and equally loaded, except for AcrZ, where the sample was diluted 1:5. Ponceau S staining for the same region is shown below each immunoblot.
FIG 5
Novel smORFs (blue) are encoded antisense to known genes (gray). (a to d) Gene organization for the nrdB-yoaM (a), yqgC-yqgG (b), yghE-yqhJ (c), and waaL-yibX-yibY (d) loci. β-Galactosidase activity was assayed for cells carrying chromosomal fusions of the 5′ UTR and initial codons of yoaM fused to lacZ as well as out-of-frame control fusion (e), which were grown in rich media (LB) with 0.2% arabinose. (f to h) Protein levels for chromosomally SPA-tagged yqgC and yqgG (f), yghE and yqhJ (g) and waaL and yibX (h) genes. Gel samples were prepared from MG1655 strains grown to exponential (E) and stationary (S) phase in LB. Immunoblot analysis was conducted as described in the legend to Fig. 3 with MG1655 as a negative control. Bands corresponding to small proteins are marked with an asterisk, and bands corresponding to antisense-encoded larger proteins are marked with two asterisks.
FIG 6
smORFs are found in complex gene arrangements. (a to d) Gene organization for yhgO/yhgP (a), yriA/yriB (b), ybgU/ybgV (c), and mgtS/mgtT (d), with previously identified small protein genes in gray, newly identified small protein genes in blue, and small RNA gene mgrR in green. (e to h) Levels of corresponding proteins. Gel samples were prepared from MG1655 strains grown to exponential (E) and stationary (S) phase in LB. Immunoblot analysis was conducted as described in the legend to Fig. 3 with MG1655 as a negative control. Bands corresponding to small proteins are marked with an asterisk.
FIG 7
smORFs regulate expression of downstream genes. (a to e) Organization of smORFs (blue) in 5′ UTRs of known genes (gray). β-Galactosidase activity was assayed for cells carrying chromosomal fusions of the 5′ UTR and initial codons of ORF33 (f) and pssL (g) fused to lacZ. β-Galactosidase activity was assayed for cells carrying lacZ chromosomal fusions to the 5′ UTR and initial codons of the downstream gene with a wild-type start codon for the upstream smORF or with a stop codon replacing the start codon (f to j). For all β-galactosidase assays, cells were grown in LB with 0.2% arabinose.

References

    1. Storz G, Wolf YI, Ramamurthi KS. 2014. Small proteins can no longer be ignored. Annu Rev Biochem 83:753–777. doi: 10.1146/annurev-biochem-070611-102400.
    2. Andrews SJ, Rothnagel JA. 2014. Emerging evidence for functional peptides encoded by short open reading frames. Nat Rev Genet 15:193–204. doi: 10.1038/nrg3520.
    3. Saghatelian A, Couso JP. 2015. Discovery and characterization of smORF-encoded bioactive polypeptides. Nat Chem Biol 11:909–916. doi: 10.1038/nchembio.1964.
    4. Hobbs EC, Yin X, Paul BJ, Astarita JL, Storz G. 2012. Conserved small protein associates with the multidrug efflux pump AcrB and differentially affects antibiotic resistance. Proc Natl Acad Sci U S A 109:16696–16701. doi: 10.1073/pnas.1210093109.
    5. Wang H, Yin X, Wu Orr M, Dambach M, Curtis R, Storz G. 2017. Increasing intracellular magnesium levels with the 31-amino acid MgtS protein. Proc Natl Acad Sci U S A 114:5689–5694. doi: 10.1073/pnas.1703415114.
