Mol Microbiol. 2020 Jul;114(1):93-108.
doi: 10.1111/mmi.14498. Epub 2020 Apr 9.

Polycysteine-encoding leaderless short ORFs function as cysteine-responsive attenuators of operonic gene expression in mycobacteria

Jill G Canestrari et al. Mol Microbiol. 2020 Jul.

Abstract

Genome-wide transcriptomic analyses have revealed abundant expressed short open reading frames (ORFs) in bacteria. Whether these short ORFs, or the small proteins they encode, are functional remains an open question. One quarter of mycobacterial mRNAs are leaderless, beginning with a 5'-AUG or GUG initiation codon. Leaderless mRNAs often encode unannotated short ORFs as the first gene of a polycistronic transcript. Here, we show that polycysteine-encoding leaderless short ORFs function as cysteine-responsive attenuators of operonic gene expression. Detailed mutational analysis shows that one polycysteine short ORF controls expression of the downstream genes. Our data indicate that ribosomes stalled in the polycysteine tract block mRNA structures that otherwise sequester the ribosome-binding site of the 3' gene. We assessed endogenous proteomic responses to cysteine limitation in Mycobacterium smegmatis using mass spectrometry. Six cysteine metabolic loci having unannotated polycysteine-encoding leaderless short ORF architectures responded to cysteine limitation, revealing widespread cysteine-responsive attenuation in mycobacteria. Individual leaderless short ORFs confer independent operon-level control, while their shared dependence on cysteine ensures a collective response mediated by ribosome pausing. We propose the term ribulon to classify ribosome-directed regulons. Regulon-level coordination by ribosomes on sensory short ORFs illustrates one utility of the many unannotated short ORFs expressed in bacterial genomes.
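
The architecture described in the abstract (a transcript that begins directly with an AUG or GUG initiation codon and encodes a short ORF containing a polycysteine tract) can be illustrated with a minimal sketch. This is not the authors' annotation pipeline. The function names, the 50-codon length cutoff, and the toy transcript are assumptions made for illustration only.

```python
from itertools import groupby, product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def leaderless_sorf(transcript, max_codons=50):
    """Return the peptide of a leaderless short ORF, or None.

    The transcript must start at position 0 with AUG or GUG (no 5' leader).
    max_codons is a hypothetical cutoff for "short", not a value from the paper.
    """
    seq = transcript.upper().replace("U", "T")
    if seq[:3] not in ("ATG", "GTG"):
        return None
    peptide = ["M"]  # the initiator codon is read as fMet, even when it is GUG
    for i in range(3, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*":
            return "".join(peptide) if len(peptide) <= max_codons else None
        peptide.append(aa)
    return None  # no in-frame stop codon found


def longest_cys_run(peptide):
    """Length of the longest run of consecutive cysteines in a peptide."""
    return max((len(list(g)) for aa, g in groupby(peptide) if aa == "C"), default=0)


# Toy transcript invented for this example; it is not a mycobacterial sequence.
toy = "GUGGCAUGCUGUUGCUGAGG"
peptide = leaderless_sorf(toy)
cys_run = longest_cys_run(peptide)
```

A transcript that carries a 5' leader before its start codon returns None here, because the check is anchored at the first nucleotide of the transcript.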

Keywords: cysteine; mycobacteria; operon; polycysteine; regulon; short ORFs.

Figures

Figure 1.
Schematic comparison of structural features of operon attenuation. A. In transcriptional attenuation, the balance of dueling mRNA hairpin structures is controlled by ribosome occupancy of the sensory sORF. Rapid ribosome transit (when tRNAtrp is plentiful, for example) allows the vacated sORF mRNA to hybridize with a complementary sequence, thereby forming the stem of an intrinsic transcriptional terminator. This transcriptional termination prevents unneeded expression of the downstream tryptophan biosynthetic genes. Under conditions of reduced tRNAtrp levels, ribosome pausing in the sORF blocks the attenuating transcriptional terminator from forming; an anti-terminator structure forms instead, which promotes RNA polymerase extension into the operon. Transcriptional attenuation through controlled intrinsic termination occurs in the E. coli tryptophan and S. typhimurium histidine biosynthetic loci, and in the whiB7 locus in Mycobacterium abscessus (Burian and Thompson, 2018, Johnston et al., 1980, Yanofsky, 1981). B. In translational attenuation, sensory sORF mRNA base pairing obscures the SD of the downstream gene, preventing translation initiation. Ribosome arrest in the sORF prevents duplex formation, and consequently frees the SD to recruit and position translation initiation complexes for expression of the annotated gene. Translational attenuation has been described for some ribosome-targeting antibiotics (Arenz et al., 2014, Lovett and Rogers, 1996, Ito and Chiba, 2013).
Figure 2.
Polycysteines are enriched in LL-sORFs. The frequency of consecutive amino acids is shown in annotated genes (X-axis) and in LL-sORFs (Y-axis). Frequencies for each clustered amino acid (at least two consecutive) are expressed as percentages of its total in each group (see Table S2 for tallies).
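
The clustering frequency plotted in Figure 2 can be sketched as follows. This is one reading of the caption, not the authors' script: it assumes that each residue inside a run of at least two identical consecutive residues counts as "clustered", and that the percentage is taken over all occurrences of that amino acid in the group. The function name and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import groupby


def clustered_percentages(proteins, min_run=2):
    """Percent of each residue's occurrences that lie in runs of >= min_run.

    Assumption: residues are counted individually (a run of 8 cysteines
    contributes 8 clustered cysteines), which is one way to read the caption.
    """
    total = Counter()
    clustered = Counter()
    for seq in proteins:
        for residue, run in groupby(seq):
            n = len(list(run))
            total[residue] += n
            if n >= min_run:
                clustered[residue] += n
    return {aa: 100.0 * clustered[aa] / total[aa] for aa in total}


# Toy peptides invented for this example; they are not data from Table S2.
toy_sorfs = ["MACCCCCCCC", "MKCCA"]
percentages = clustered_percentages(toy_sorfs)
```

Running the same function over the annotated proteins and over the LL-sORF peptides would give the two values compared on the X and Y axes for each amino acid.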
Figure 3.
The architecture of the Ms5788A to Ms5790 locus and the polycysteine-encoding mRNA leader. The amino acid sequence of the wild-type Ms5788A includes 9 cysteine codons (bold C), with eight consecutive codons at the C-terminus. This leader sequence to Ms5788 is predicted to fold into a stable duplex structure (Fig S2), represented here in schematic form and annotated with features relevant for regulated expression of Ms5788. The ORF defined by the initiating GUG fMet is shown as a magenta line, with key sequences indicated.
Figure 4.
Hypothesized model and ribosome dependence of Ms5788A-directed attenuation. A. Ms5788A is transcribed (RNA polymerase, RNAP) and translated (orange ribosome). Ribosomes waiting (left) for charged tRNAcys prevent the leader from folding into a stable, attenuating conformation (right). The nascent small protein is shown emerging from the elongating ribosome and is annotated with the initiating methionine (fM) and polycysteine (C) tract for reference. Except where noted, luciferase reporters are translational fusions of the NanoLuc ORF in-frame with the AUG initiating methionine of the annotated Ms5788 gene. B. The same wild-type (i) cysteine-responsive reporter data are shown in Figs 4–6 (black outlined or filled boxes) as a comparative reference for the mutant derivatives of each series (blue outlined or filled boxes in Fig 4). Non-start (ii) and nonsense (iii) mutations reduce ribosome occupancy and increase attenuation of the reporter. An out-of-frame mutation permits ribosome occupancy but is not cysteine responsive (iv). Lower case letters below clones indicate nucleotide substitutions and insertion/deletions (+/−) used to create mutants. Luciferase activity is shown as light units (× 10^9) per ml of cell culture, and asterisks indicate significance (p < .01) for cysteine supplementation, by two-tailed t-test in Figs 4–6.
Figure 5.
Predicted duplex mRNA interactions are required for the attenuated state of the Ms5788 luciferase reporter fusion. Clustered point mutations were introduced into the Ms5788A LL-sORF or the predicted pairing nucleotides. Individually, these mutations should disrupt base-pairing; when combined, they should restore the modeled stable mRNA structure. Two series of mutants were generated (angled hatch fill, ii – iv and horizontal and vertical hatch fill, v – vii). Structure-destabilizing mutations and a re-stabilizing combination were tested for their effect on the luciferase reporter fused to Ms5788. All three clustered mutants (ii, iii, v) that are expected to disrupt the predicted duplex structure model an unattenuated state, whereas a G:U stabilized mutant (vi) or combined complement mutants (iv, vii) retain cysteine responsiveness. See Fig S3 for G:U stabilized structure detail.
Figure 6.
Attenuation regulates translation initiation of Ms5788. Fifteen nt were inserted at the end of the wild-type leader and included an independent SD to assess mRNA extension beyond the structured attenuator (ii). Luciferase activity is now independent of the effects of cysteine supplementation on the attenuating structure. Placement of the luciferase reporter at the initiation codon of Ms5789 (iii) indicates that translation attenuation in Ms5788 has polar effects, propagating the control of Ms5788A to operonic genes.
Figure 7.
Cysteine attenuation is observed in the native context. A. Label-Free Quantitative proteomics (LFQ) was used to identify and quantitate changes in the abundance of proteins from whole-cell extracts of wild-type M. smegmatis subjected to trypsin digestion and nanoUHPLC-MS/MS. M. smegmatis was cultured in minimal medium (X-axis) or minimal medium supplemented with cysteine (Y-axis). Values plotted are normalized LFQ peak area (protein) in arbitrary units. The baselines for both X and Y-axes were artificially set at 5 × 10^4 counts (indicated by dashed limit of detection) to allow depiction of proteins not expressed in one of the two conditions. Peptides from Ms5789 and Ms5790 (white diamond 1 and 2, respectively) increased in abundance (shift right along the X-axis) under cysteine limitation. Peptides of annotated proteins from loci similarly encoded on polycysteine-encoding LL-sORF mRNAs are indicated by black filled circles (Fig S4; 3 = Ms0113; 4 = Ms0114; 5 = Ms0934; 6 = Ms4527; 7 = Ms4533; 8 = Ms5279; 9 = Ms5280). B. A frameshift mutation (OoF) in Ms5788A changes CCCCCCCC* to VAVVVVAVERSRAL*. This OoF LL-sORF mutant is insensitive to cysteine and does not release Ms5789 and Ms5790 from an attenuated state. C. Deleting Ms5788A elevates expression of Ms5789 and Ms5790 and renders it insensitive to cysteine, indicating a completely unattenuated state.
Figure 8.
Phylogenetic distribution of polycysteine LL-sORFs in mycobacteria. A robust reference phylogenetic tree was constructed from complete genome sequences of 41 diverse Mycobacterium spp. All nodes had bootstrap values of 100, except where indicated. Species of the slow-grower clade appear in blue text, and fast-growers in magenta. The presence (black square) or absence (white square) of each LL-sORF forms a binary barcode for each species. For example, the presence of Ms0113A in some fast-growing species is indicated by a black square in the leftmost box of the barcode. Barcode key and fixed-length sequence logo for each LL-sORF is shown at right.
Figure 9.
Schematic comparison of a traditional transcriptionally coordinated regulon and the translationally controlled ribulon. A. Regulons are usually controlled at the transcriptional level and are named for their master regulator (MR). In this scenario, coordinate gene expression from multiple loci is mediated by the MR activating transcription. The MR is shown binding to multiple loci, which activates transcription and subsequent translation by the ribosome. This includes activation of sigma factors (e.g., Sigma F flagellar regulon in Salmonella typhimurium (Ohnishi et al., 1992)), inactivation of a repressor (e.g., fur regulon in E. coli (Stojiljkovic et al., 1994)), or two-component activation (e.g., slyA regulon in Salmonella typhimurium (Zhao et al., 2008)). B. In the Cys ribulon, global gene coordination occurs by mutual pausing of ribosomes in ribulon sORFs. Attenuation at individual LL-sORFs at independent operons is co-dependent on tRNAcys levels, resulting in coordinated gene expression. In cysteine-replete conditions, polycysteine sORFs are rapidly translated and expression of downstream genes is repressed. Under conditions of limiting tRNAcys levels, ribosomes pause at polycysteine tracts in the sensory LL-sORFs and relieve attenuation of operonic genes to upregulate processes required for the production of cysteine. Sensory sORFs are shown as black boxes, stalled ribosomes are red-shaded, and activated genes are indicated by open rectangles and translated by elongating ribosomes (green-shaded) with the emerging nascent protein.

References

    1. ARENZ S, MEYDAN S, STAROSTA AL, BERNINGHAUSEN O, BECKMANN R, VAZQUEZ-LASLOP N & WILSON DN 2014. Drug sensing by the ribosome induces translational arrest via active site perturbation. Mol Cell, 56, 446–52.
    2. BARKAN D, STALLINGS CL & GLICKMAN MS 2011. An improved counterselectable marker system for mycobacterial recombination using galK and 2-deoxy-galactose. Gene, 470, 31–6.
    3. BECHHOFER DH 1990. Triple post-transcriptional control. Mol Microbiol, 4, 1419–23.
    4. BECK HJ & MOLL I 2018. Leaderless mRNAs in the Spotlight: Ancient but Not Outdated! Microbiol Spectr, 6.
    5. BOSSERMAN RE, NGUYEN TT, SANCHEZ KG, CHIRAKOS AE, FERRELL MJ, THOMPSON CR, CHAMPION MM, ABRAMOVITCH RB & CHAMPION PA 2017. WhiB6 regulation of ESX-1 gene expression is controlled by a negative feedback loop in Mycobacterium marinum. Proc Natl Acad Sci U S A, 114, E10772–E10781.