Nat Methods. 2018 May;15(5):323-329.
doi: 10.1038/nmeth.4633. Epub 2018 Mar 19.

Metagenomic mining of regulatory elements enables programmable species-selective gene expression

Nathan I Johns et al. Nat Methods. 2018 May.

Abstract

Robust and predictably performing synthetic circuits rely on the use of well-characterized regulatory parts across different genetic backgrounds and environmental contexts. Here we report the large-scale metagenomic mining of thousands of natural 5' regulatory sequences from diverse bacteria, and their multiplexed gene expression characterization in industrially relevant microbes. We identified sequences with broad and host-specific expression properties that are robust in various growth conditions. We also observed substantial differences between species in terms of their capacity to utilize exogenous regulatory sequences. Finally, we demonstrate programmable species-selective gene expression that produces distinct and diverse output patterns in different microbes. Together, these findings provide a rich resource of characterized natural regulatory sequences and a framework that can be used to engineer synthetic gene circuits with unique and tunable cross-species functionality and properties, and also suggest the prospect of ultimately engineering complex behaviors at the community level.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1
Metagenomic mining and high-throughput characterization of regulatory sequences from 184 prokaryotic genomes. Unidirectional intergenic regions (>200 bp) were extracted from annotated genomes, trimmed to 165 bp, and assigned unique barcodes, flanking restriction sites, and amplification sequences. The regulatory library was then synthesized on an oligo microarray, amplified, cloned as a pool into species-specific vectors, and transformed into B. subtilis, E. coli, and P. aeruginosa recipients. Targeted RNA-seq, DNA-seq, and FACS-seq enable accurate multiplexed measurement of transcription and translation levels.
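The extraction and trimming step in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Gene` record, the function names, the 0-based half-open coordinates, and the choice to keep the 165 bp immediately upstream of the downstream gene's start are all hypothetical. The caption does not say which end of the intergenic region is kept, and "unidirectional" is read here as "both flanking genes are on the same strand".

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    start: int   # 0-based, inclusive (hypothetical coordinate convention)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_regulatory_regions(genome: str, genes: List[Gene],
                               min_gap: int = 200, keep: int = 165) -> List[str]:
    """Collect trimmed intergenic regions between same-strand neighbours.

    Assumption: the kept window is the `keep` bp directly upstream of the
    gene that the region would regulate, reported 5'->3' on that gene's strand.
    """
    regions = []
    ordered = sorted(genes, key=lambda g: g.start)
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            continue  # divergent or convergent pair: not unidirectional
        if right.start - left.end <= min_gap:
            continue  # caption requires intergenic regions > 200 bp
        if right.strand == "+":
            # Downstream gene is `right`; take the bases just before its start.
            regions.append(genome[right.start - keep:right.start])
        else:
            # On the minus strand the regulated gene is `left`; its upstream
            # region lies to its right on the forward-strand coordinates.
            regions.append(reverse_complement(genome[left.end:left.end + keep]))
    return regions
```

Barcode assignment, restriction sites, and amplification sequences are left out because the caption gives no details about them.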
Figure 2
Transcriptional activities of the regulatory library across 3 diverse species. (a) Transcriptional activities of 11,319 regulatory sequences (RSs) measured in B. subtilis, E. coli, and P. aeruginosa are shown in the heatmap, with host-specific groupings annotated above and general categories below. Transcription levels are log2 (RNA/DNA) ratios normalized by the mean activity of control sequences (see Methods). (b) A histogram of the GC content of the full RS library and of the universally active subset, highlighting the AT bias of active RSs. (c) The activity profiles of RSs from three distinct phylogenetic groups (red: Bacillaceae, orange: Enterobacteriaceae, and blue: Pseudomonadaceae) measured in each recipient species are shown as fraction active (left) and as normalized activity level displayed as a violin plot (right). Box plots (black) with mean values (white dots) are displayed over each violin plot. Cases where donor RSs and recipients share the same phylogeny are highlighted with dashed black borders. Sample sizes (n) are listed in parentheses below the distributions.
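As a rough illustration of the transcription metric in panel (a), the sketch below computes a log2 (RNA/DNA) ratio per sequence and then centers it on the mean of a set of control sequences. Several details are assumptions rather than statements from the paper: the function name, the dictionary inputs, the pseudocount, and the choice to normalize by subtracting the control mean in log space (the caption only says "normalized by the mean activity of control sequences" and defers to the Methods).

```python
import math
from typing import Dict, Iterable


def transcription_activity(rna_counts: Dict[str, float],
                           dna_counts: Dict[str, float],
                           control_ids: Iterable[str],
                           pseudocount: float = 1.0) -> Dict[str, float]:
    """Return control-centered log2(RNA/DNA) for every sequence with DNA counts.

    Assumptions (not from the source): a pseudocount guards against zero
    counts, and normalization is a subtraction of the mean control log-ratio.
    """
    raw = {}
    for seq_id, dna in dna_counts.items():
        rna = rna_counts.get(seq_id, 0.0)
        raw[seq_id] = math.log2((rna + pseudocount) / (dna + pseudocount))

    controls = list(control_ids)
    control_mean = sum(raw[c] for c in controls) / len(controls)
    return {seq_id: value - control_mean for seq_id, value in raw.items()}
```

With this convention a value of 0 means "as active as the average control", and each unit is a two-fold change in RNA per DNA copy.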
Figure 3
Assessing regulatory features that govern transcriptional activity. (a) Distributions of transcriptional activity are shown for each host. A subset of 200 sequences from the top 10% most active promoters in each recipient was used for separate motif analyses, yielding the dominant σ70 motif. (b) Transcription activity is correlated with biophysical parameters: promoter GC content (left), maximum σ70 match score (center), and mRNA structural stability (right). Mean activities for each feature window are shown with error bars denoting standard errors. (c) Linear regression model using the three biophysical parameters. Excluding promoters used to identify the σ70 motif, the training and test sets for the regression model correspond to 10% and 90% of the data, respectively. A subset of 500 points is displayed with larger point size to improve visualization. Sample sizes (n) and Pearson correlation coefficients (r) are listed in each subplot.
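The regression setup in panel (c) can be illustrated with a small self-contained sketch: fit an ordinary least-squares model on three features using 10% of the data, predict the remaining 90%, and report the Pearson correlation. This is not the authors' code. The function names, the random split, and the plain normal-equation solver are assumptions made for illustration, and the caption does not say how the three features were scaled or how the split was drawn.

```python
import random
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(features: Sequence[Sequence[float]],
                     targets: Sequence[float]) -> List[float]:
    """Least-squares fit; returns [intercept, coef_1, coef_2, ...]."""
    rows = [[1.0, *f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float],
            features: Sequence[Sequence[float]]) -> List[float]:
    """Apply the fitted intercept and coefficients to each feature row."""
    return [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], f)) for f in features]


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def train_test_split(n_items: int, train_fraction: float = 0.1,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices and return (train, test); a random split is an assumption."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    cut = max(1, round(n_items * train_fraction))
    return idx[:cut], idx[cut:]
```

In use, each feature row would hold (GC content, maximum σ70 match score, mRNA structural stability) for one promoter and the target would be its measured transcription activity; `pearson_r` on the held-out predictions gives the kind of r value quoted in each subplot.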
Figure 4
FACS-seq of the RS library. (a) Sorting of the RS library (top) and the fraction of the population sorted into each bin for each host (bottom). (b) Heatmap panels show the fraction of the RS library distributed across bins of transcription and translation levels in the three recipients. The top row of heatmap subpanels uses values normalized by the total number of regulatory sequences. The middle row uses values normalized within each column bin, corresponding to transcription windows. The bottom row uses values normalized within each row bin, corresponding to translation windows. (c) Pie charts showing the fractions of the RS library that are transcriptionally active (in orange) and that have translation level >1.5 (in blue), based on the bins in (b).
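The three normalizations described for panel (b) differ only in the denominator, which a short sketch makes concrete. The matrix layout is an assumption for illustration: `counts[i][j]` is taken to be the number of RSs in translation bin `i` (rows) and transcription bin `j` (columns), matching the caption's pairing of columns with transcription windows and rows with translation windows. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def normalize_by_total(counts: Matrix) -> Matrix:
    """Top row of panels: each cell as a fraction of all sequences."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def normalize_by_column(counts: Matrix) -> Matrix:
    """Middle row: each cell as a fraction of its transcription (column) bin."""
    col_sums = [sum(col) for col in zip(*counts)]
    return [[v / s if s else 0.0 for v, s in zip(row, col_sums)] for row in counts]


def normalize_by_row(counts: Matrix) -> Matrix:
    """Bottom row: each cell as a fraction of its translation (row) bin."""
    normalized = []
    for row in counts:
        s = sum(row)
        normalized.append([v / s if s else 0.0 for v in row])
    return normalized
```

Column normalization answers "given this transcription level, how is translation distributed?", while row normalization answers the converse; empty bins are returned as zeros here to avoid division by zero.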
Figure 5
Species-selective gene circuits. (a) Design of species-selective gene circuits (SsGCs) with specified host expression profiles, using two outward-facing regulatory sequences buffered by a strong bidirectional terminator to drive expression of two fluorescence genes, mCherry and sf-GFP. The pNJ6.2 vector is transformable into B. subtilis, E. coli, and P. aeruginosa. (b) Combinatorial construction of 12 host-specified regulatory sequences (Seq ID 1-12) into 10 SsGCs with different regulatory profiles, and their fluorescence characterization in three recipient species. Distinct regulatory categories include universally active (constructs A-C), B. subtilis-excluding or E. coli-excluding in the GFP channel (constructs D-E or F-G, respectively), E. coli-excluding in the mCherry channel (constructs H-I), and P. aeruginosa-specific in the mCherry channel (construct J).

