Nat Commun. 2019 Jun 28;10(1):2880.
doi: 10.1038/s41467-019-10912-8.

A high-throughput screening and computation platform for identifying synthetic promoters with enhanced cell-state specificity (SPECS)

Ming-Ru Wu et al. Nat Commun. 2019.

Abstract

Cell state-specific promoters constitute essential tools for basic research and biotechnology because they activate gene expression only under certain biological conditions. Synthetic Promoters with Enhanced Cell-State Specificity (SPECS) can be superior to native ones, but the design of such promoters is challenging and frequently requires gene regulation or transcriptome knowledge that is not readily available. Here, to overcome this challenge, we use a next-generation sequencing approach combined with machine learning to screen a synthetic promoter library with 6107 designs for high-performance SPECS for potentially any cell state. We demonstrate the identification of multiple SPECS that exhibit distinct spatiotemporal activity during the programmed differentiation of induced pluripotent stem cells (iPSCs), as well as SPECS for breast cancer and glioblastoma stem-like cells. We anticipate that this approach could be used to create SPECS for gene therapies that are activated in specific cell states, as well as to study natural transcriptional regulatory networks.

Conflict of interest statement

MW, LN, and TKL have filed patent applications (application number: 62/470754) on the work. MW, LN, and TKL are inventors on this patent. TKL is a co-founder of Senti Biosciences, Synlogic, Engine Biosciences, Tango Therapeutics, Corvium, BiomX, and Eligo Biosciences. TKL also holds financial interests in nest.bio, Ampliphi, and IndieBio. The other authors declare no competing interests.

Figures

Fig. 1
The experimental and computational pipeline for identifying cell state-specific promoters. (a) The experimental pipeline consisted of infecting cells with synthetic promoter libraries encoded on lentiviruses, sorting the cells by FACS into subpopulations according to fluorescence intensity, next-generation sequencing (NGS), and computational analysis to identify the promoters enriched in each subpopulation. From top to bottom: the promoters in the library contained tandem repeats of a single transcription factor (TF) binding site (BS) (colored boxes). Cells of different cell states (e.g., normal vs. cancer) were infected with the pooled library and then sorted by FACS into bins based on fluorescence intensity. NGS was then performed to determine the abundance of each promoter in each bin. Finally, a machine-learning-based prediction was used to determine the activity of each promoter and its cell state specificity (e.g., light blue indicates that the promoter is specific to cancer cells, whereas light green indicates that the promoter is specific to normal cells). (b) The cells infected with the promoter library were sorted by FACS into five subpopulations according to fluorescence intensity (negative, low, high, top 5–10%, top 5%), followed by NGS and computational analysis to identify the promoters enriched in each subpopulation.
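The sort-and-sequence readout in Fig. 1 can be illustrated with a minimal sketch. It assumes that per-bin read counts are already available as a nested dictionary and that a promoter's activity can be summarized as a weighted mean over bins. The bin keys, the numeric weights, and the function names are hypothetical, and this simple score stands in for, and is not, the machine-learning model described in the paper.

```python
from collections import defaultdict

# Hypothetical bin keys for the five FACS gates in panel b. The numeric
# weights are illustrative assumptions, not values taken from the paper.
BIN_WEIGHTS = {"negative": 0.0, "low": 1.0, "high": 2.0, "top5_10": 3.0, "top5": 4.0}


def bin_fractions(counts):
    """Convert raw reads {bin: {promoter: reads}} into per-bin read fractions."""
    fractions = {}
    for bin_name, promoter_reads in counts.items():
        total = sum(promoter_reads.values())
        fractions[bin_name] = {
            promoter: (reads / total if total else 0.0)
            for promoter, reads in promoter_reads.items()
        }
    return fractions


def activity_scores(counts):
    """Score each promoter as the weighted mean of the bins it appears in.

    Fractions rather than raw reads are used so that bins sequenced at
    different depths contribute comparably.
    """
    weighted = defaultdict(float)
    mass = defaultdict(float)
    for bin_name, promoter_fracs in bin_fractions(counts).items():
        for promoter, frac in promoter_fracs.items():
            weighted[promoter] += BIN_WEIGHTS[bin_name] * frac
            mass[promoter] += frac
    return {p: weighted[p] / mass[p] for p in mass if mass[p] > 0}


if __name__ == "__main__":
    # Toy counts for two made-up promoters, "A" and "B".
    toy = {
        "negative": {"A": 90, "B": 10},
        "top5": {"A": 10, "B": 90},
    }
    print(activity_scores(toy))
```

Computing the same score separately for each cell state and comparing the two values per promoter gives a rough specificity ranking under these assumptions.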
Fig. 2
Synthetic promoters exhibit distinct temporal and spatial behavior in organoid cultures derived from iPSCs. (a) The heat maps show distinct temporal and spatial activities of four promoters across the time course of differentiation. The X-axis denotes the days post Dox-induced differentiation. The Y-axis denotes the fluorescence intensity as the pixel value of an 8-bit image (fluorescence intensity is equally divided into 256 bins, 0 being the lowest and 255 being the highest). Heat map colors show the relative frequencies of pixel fluorescence intensity distribution in each bin with a log pseudocount to account for absent bins [(1 + number of pixels in each fluorescence intensity bin/number of total pixels)]. The distributions show the difference in the timing and strength of promoter activation, and the fraction of the image containing fluorescent cells. The negative control sample consisted of cells infected with a non-fluorescent protein; the positive control sample consisted of cells infected with a Ubiquitin C promoter expressing mKate2. (b) Representative fluorescence and bright field microscopy images show distinct temporal and spatial activities and differences in expression strength of the four promoters. The sub-regions exhibiting the strongest fluorescence signal for each promoter are shown. The left panel contains the bright field images (Days 4–19), the middle panel contains the overlay images (Days 4–19), and the right panel contains the fluorescence images (Days 4–19). (c) The heat maps show the relative frequencies of pixel distribution in each fluorescence bin for the representative fluorescence microscopy images in b. N = 3 biological replicates.
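The per-image quantity behind the Fig. 2 heat maps can be sketched as follows. The caption gives the pseudocount expression without naming the logarithm base or its exact grouping, so this sketch assumes log10(1 + n_bin / n_total); both choices, and the function names, are assumptions for illustration only.

```python
import math


def pixel_histogram(pixels, n_bins=256):
    """Count how many pixels fall into each 8-bit intensity bin (0..255)."""
    counts = [0] * n_bins
    for value in pixels:
        if not 0 <= value < n_bins:
            raise ValueError(f"pixel value {value} is outside 0..{n_bins - 1}")
        counts[value] += 1
    return counts


def log_relative_frequencies(pixels, n_bins=256):
    """Return log10(1 + count / total) for each bin.

    Adding 1 before taking the logarithm keeps empty bins finite (they map
    to 0.0). The base-10 logarithm and this grouping are assumptions.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("image has no pixels")
    return [math.log10(1 + c / total) for c in pixel_histogram(pixels, n_bins)]


if __name__ == "__main__":
    # A toy four-pixel "image": half dark, half saturated.
    column = log_relative_frequencies([0, 0, 255, 255])
    print(column[0], column[1], column[255])
```

One such 256-element column per time point, stacked left to right by day, would give a heat map laid out like the ones the caption describes.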
Fig. 3
A machine-learning-based prediction model can efficiently predict cell state specificity. (a) Validation guided by machine-trained algorithms. We selected 54 promoters predicted to be specific to either of the cell states or to have a range of fluorescence in either cell state (defined as four “classes” of promoters). Specific promoters showed up to ~1000-fold difference in activity between cell states and exhibited activity as strong as that of a constitutive promoter (Ubiquitin C promoter) commonly used for gene expression (also used as the positive control, Pos. Control). The negative control sample (Neg. Control) consisted of cells infected with a non-fluorescent protein. Names refer to the TF-BS in the promoter. All the promoters shown here are taken from the newly generated validation set, except for MAFK v1, which was identified by the Top 5% approach, and MAFG, which was taken from the training data. The dots represent the values of three biological replicates. (b) The machine-learning-based prediction model achieved a Pearson R² of 0.77 between the predicted and true fluorescence measured by FACS (log2-scaled) on a held-out test set. (c) Inspecting the predicted fold difference of all promoters in the library showed that many promoters were specific to each cell state. The Top 5% approach identified cell state-specific promoters (in red) with statistical significance (p = 0.0016, Wilcoxon rank sum test, two-sided). Error bars represent S.E.M., N = 3 biological replicates. Source data are provided as a Source Data file.
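The agreement metric quoted for Fig. 3b can be made concrete with a small sketch of the squared Pearson correlation between predicted and measured values, with the measured fluorescence log2-transformed first. The function names and toy numbers are illustrative assumptions; the sketch does not reproduce the paper's model or data.

```python
import math


def pearson_r2(xs, ys):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


def log2_scale(values):
    """Log2-transform strictly positive fluorescence values."""
    return [math.log2(v) for v in values]


if __name__ == "__main__":
    # Made-up held-out example: predictions on the log2 scale, raw measurements.
    predicted = [5.0, 7.5, 9.0, 12.0]
    measured_raw = [40.0, 150.0, 600.0, 3500.0]
    print(pearson_r2(predicted, log2_scale(measured_raw)))
```

For the two-sided Wilcoxon rank sum test mentioned in panel c, an existing implementation such as `scipy.stats.ranksums` would be the usual choice rather than hand-written code.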
Fig. 4
Promoter activities in glioblastoma stem-like cells (GSCs) and serum-cultured glioblastoma cells (ScGCs). Thirty promoters predicted to be specific to either MGG4 ScGCs or GSCs were validated (defined as two “classes” of promoters). Among the 15 promoters predicted to be ScGC-specific, five showed >10-fold higher activity in ScGCs than in GSCs, ranging from 27-fold to 460-fold. Among the 15 promoters predicted to be GSC-specific, one showed 100-fold higher activity in GSCs than in ScGCs. The upper panel depicts the median fluorescence intensity of each promoter. The blue bars denote the activity in MGG4 GSCs, and the yellow bars denote the activity in MGG4 ScGCs. The lower panel shows the log10 difference in activity between MGG4 ScGCs and GSCs for each promoter. The name on the X-axis denotes the TF-BS of each promoter. The dots represent the values of three biological replicates. Error bars represent S.E.M., N = 3 biological replicates. Source data are provided as a Source Data file.
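The lower-panel quantity in Fig. 4 can be illustrated with a short sketch. It assumes that each promoter has three replicate median fluorescence intensities per cell type and that the fold difference is the ratio of the replicate means; the caption does not state how replicates were combined, so that choice, the function names, and the numbers are assumptions.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return stdev(values) / math.sqrt(len(values))


def log10_fold_difference(scgc_values, gsc_values):
    """log10 of (mean ScGC activity / mean GSC activity).

    Positive values indicate higher activity in ScGCs, negative values
    higher activity in GSCs. Averaging replicates first is an assumption.
    """
    return math.log10(mean(scgc_values) / mean(gsc_values))


if __name__ == "__main__":
    # Made-up replicate intensities for one hypothetical promoter.
    scgc = [980.0, 1010.0, 1010.0]
    gsc = [9.0, 10.0, 11.0]
    print(log10_fold_difference(scgc, gsc), sem(scgc), sem(gsc))
```

On this scale, the 27-fold and 460-fold differences quoted in the caption correspond to log10 values of about 1.43 and 2.66.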
