Estrogen-dependent control and cell-to-cell variability of transcriptional bursting
- PMID: 29476006
- PMCID: PMC5825209
- DOI: 10.15252/msb.20177678
Estrogen-dependent control and cell-to-cell variability of transcriptional bursting
Abstract
Cellular decision-making and environmental adaptation are dependent upon a heterogeneous response of gene expression to external cues. Heterogeneity arises in transcription from random switching between transcriptionally active and inactive states, resulting in bursts of RNA synthesis. Furthermore, the cellular state influences the competency of transcription, thereby globally affecting gene expression in a cell-specific manner. We determined how external stimuli interplay with cellular state to modulate the kinetics of bursting. To this end, single-cell dynamics of nascent transcripts were monitored at the endogenous estrogen-responsive GREB1 locus. Stochastic modeling of gene expression implicated a two-state promoter model in which the estrogen stimulus modulates the frequency of transcriptional bursting. The cellular state affects transcriptional dynamics by altering initiation and elongation kinetics and acts globally, as GREB1 alleles in the same cell correlate in their transcriptional output. Our results suggest that cellular state strongly affects the first step of the central dogma of gene expression, to promote heterogeneity in the transcriptional output of isogenic cells.
Keywords: bursting; estrogen signaling; heterogeneity; stochastic modeling; transcription.
© 2018 The Authors. Published under the terms of the CC BY 4.0 license.
Figures
Knock‐in strategy to integrate PP7 sequences into a GREB1 locus in MCF7 cells. CRISPR/Cas9‐mediated knock‐in of PP7 sequences, together with a selection cassette, into the 5′ UTR within exon 2 of GREB1 was followed by excision of the selection cassette by Cre recombinase to yield the cell line MCF7‐GREB1‐PP7 (ERE: estrogen response element, HA: homology arm, pA: polyadenylation site, Puro: puromycin resistance, IRES: internal ribosomal entry site, CMV: promoter of cytomegalovirus).
Schematic description of the PP7 system. Binding of GFP‐labeled PP7 coat protein (tdGFP‐tdPCP) to PP7 stem‐loops within nascent transcripts leads to fluorescence accumulation at the transcription site. Spot intensity decreases upon termination and transcript release. A schematic description of the fluorescence signal of a single transcript is shown below, with the 30 min which a transcript is observable estimated from gene length and published Pol II elongation rates.
Transcriptional foci in MCF7‐GREB1‐PP7 cells grown at low and high concentrations of E2. Single fluorescent foci (arrowheads) are observed within nuclei due to nuclear localization of tdGFP‐tdPCP. Maximum intensity projections of z‐stacks are shown. Scale bar: 10 μm.
E2 dose‐response. Transcription sites were automatically identified and quantified in images of fixed MCF7‐GREB1‐PP7 cells. The mean ± standard deviation from three biological replicates of > 3,000 cells per condition is shown along with a fitted Hill function. ICI 182,780 (pure anti‐estrogen) and actinomycin D (ActD; transcriptional inhibitor) serve to prevent transcription at 100 pM E2.
Endogenous E2‐initiated transcription occurs in bursts. MCF7‐GREB1‐PP7 cells were imaged for 13 h at 10 pM E2. Transcription sites (red cross) were tracked within nuclei (dashed line). A zt‐kymograph of the tracked transcription site demonstrates stable focus. Quantified transcript numbers are shown for a transcription site (red) and a control site at the center of the nucleus (gray). Scale bar: 5 μm.
Validation of successful genome engineering. Genotyping PCRs were performed on genomic DNA with primers positioned along the transgene as indicated in the scheme above. PCR products confirm successful recombination after Cas9‐mediated cleavage on both 3′ and 5′ ends of the construct, as well as successful Cre‐mediated recombination between the two loxP sites (wt: wild‐type locus, rec: locus after Cre recombination).
The estrogen sensitivity of knock‐in and wild‐type (wt) allele is comparable. E2 dose‐response curves were measured by allele‐specific RT–qPCR after starving cells of E2 for 3 days followed by induction for 18 h at the indicated E2 concentrations (data points). Hill functions were fitted (lines), and the resulting EC50 is indicated. They confirm unperturbed sensitivity (EC50) for E2 of the knock‐in allele with minor differences in maximal RNA levels that most probably result from altered transcript stability. Error bars represent standard deviation from four biological replicates.
The fraction of cells with visible transcription sites increases with E2 concentration. Transcription sites were automatically identified and quantified in fixed cells as in Fig 1D. The percentage of nuclei with detectable spots was obtained. Mean and standard deviation from three biological replicates are plotted alongside with the fit of a Hill function. ICI 182,780 (ICI) and actinomycin D (ActD) were applied at 100 pM E2.
The single‐molecule RNA FISH signal of exonic and intronic signal overlaps with the GFP signal at transcription sites. RNA FISH was performed with probes against exonic (red) and intronic (blue) regions of GREB1. Single RNAs are visualized by exonic probes as diffraction limited spots in the nucleus (dashed line) and cytoplasm of the knock‐in cell line. Bright foci in the nucleus correspond to nascent RNA at the transcription sites that are visualized by intronic and exonic probes. One of three nuclear foci (arrowhead and inset) co‐localizes with GFP signal from the PP7 system. Scale bars: 5 and 0.5 μm in the inset.
Quantification of FISH signals determines a correlation between exonic, intronic, and GFP signal at transcription sites; 87% of bright nuclear exonic foci co‐localize (distance < 5 px) with intronic foci, while only 26% of them co‐localize with GFP foci, indicating that one in three foci is labeled with GFP. Co‐localizing foci also correlate in intensity.
Calibration of spot intensities by matching live‐cell imaging distributions to absolute smRNA FISH signals. The mean intensity of single RNAs (solid vertical line) was derived from FISH images of GREB1 exons at 100 pM E2. The intensity distribution of bright nuclear foci in smRNA FISH (> 10 RNAs, red) was matched with the intensity distribution of transcription sites from live‐cell imaging at 100 pM E2 (green), indicating that an estimated maximum of 150 RNAs occurs within a transcription site.
Dose dependence of E2‐dependent transcription. Nuclear smRNA FISH signals were quantified at various E2 concentrations. The mean intensity of the two brightest nuclear foci without GFP signal (wild‐type alleles) is comparable to the mean intensity of the brightest spot co‐localizing with GFP (knock‐in allele). Both show an E2‐dependent increase that is reduced upon addition of 1 μM ICI 182,780 (ICI).
Absolute calibration of spot intensities by counting single PP7‐labeled GREB1 RNA molecules. (Left) MCF7‐GREB1‐PP7 cells were grown in 1,000 pM E2, and images were acquired at maximum light intensity. Single RNAs (red arrowheads) are apparent as dim spots, often in close proximity to the transcription site (blue asterisk). A maximum intensity projection of bandpass‐filtered images is shown in the middle. The filtering accentuates particles which are then quantified by fitting to a three‐dimensional Gaussian distribution, as seen on the right for the particle marked with the red arrowhead with white border. Scale bars: 5 and 1 μm for the single spot. (Middle) Intensity distribution of single transcripts. The histogram of spot intensities was fitted to a Gaussian function. The mean of the Gaussian function is indicated in the legend. (Right) Transcription sites were imaged at the same stage position under imaging conditions for long‐term live‐cell imaging (2% light intensity) and conditions for visualization of single transcripts (100% light intensity). Their intensities were quantified and the fitted ratio of intensities used to calculate the equivalent intensity of a single transcript under live‐cell imaging conditions.
- A
Cell‐to‐cell variation in GREB1 expression at multiple E2 concentrations. Transcription was observed for 13 h in individual MCF7‐GREB1‐PP7 cells at different E2 concentrations. Trajectories of single cells are represented as color maps and are sorted from lowest (top) to highest (bottom) total RNA output (ΣRNA), calculated as area under the curve divided by the average signal from single transcripts (right). Color denotes the absolute number of nascent RNAs. The squared coefficient of variation (standard deviation2/mean2) of the total RNA output among the population is indicated (right). Dashed lines indicate exemplary cells shown in panels (B and C).
- B
Representative time traces for low, medium, and high expressing cells.
- C
Autocorrelation (ACF) curves of the individual time traces in (B).
- D
Increasing E2 increases the proportion and productivity of transcriptionally active periods. Histograms were generated from the number of RNA molecules at the transcription site from all data points at different E2 concentrations. The distribution of the background signal is shown in gray.
- E, F
Increasing E2 concentrations decrease the length of inactive periods and increase the burst size. Promoter OFF‐times (E) and burst sizes (F) were extracted from the experimental tracks (see Fig EV2E). Exponential functions (red) with the same mean (dashed line) are shown for the OFF‐times.
- A–C
Raw data, example trajectories and autocorrelation functions (ACF) as in Fig 2, are shown for all eight E2 concentrations.
- D
Histograms of the number of RNA molecules at transcription sites. The distribution of nascent RNAs from all eight live‐cell datasets over all cells and time points is plotted (left), with the background signal shown in gray. The intensity distribution from smRNA FISH (right) is comparable across E2 concentrations.
- E
Extraction of ON‐ and OFF‐times from raw time traces. A simple thresholding strategy was used on the slope of the median‐filtered time trace to separate transcriptionally active from inactive periods. The duration of each ON‐ and OFF‐time was calculated. For each ON‐time, the burst size was calculated as the increase in the number of nascent RNAs during an ON‐time, and the initiation rate was derived as the ratio of burst size and ON‐time.
- F
Distribution of OFF‐ and ON‐times indicates that single rate‐limiting steps occur during the transition between promoter states. Burst features were extracted from time traces in panel (A) as described above. The mean value is indicated in the plots (dashed line). Exponential distributions with the same mean are plotted for the OFF‐ and ON‐times (red) and indicate good agreement. Durations in the range of the imaging interval cannot be reliably estimated leading to deviation at short ON‐durations.
Topologies and kinetic parameters of models of different complexity. Top: The promoter cycles stochastically through active (ON) and inactive (OFF) states with rates k ON and k OFF. New transcripts are initiated stochastically from the active state(s) with rate k init. Extrinsic noise is implemented by cell‐to‐cell variability in model parameters (e.g., k elong). Bottom: Signal of individual transcripts for fast (solid) and slow (dashed) elongation kinetics (k elong).
Model fit to experimental data. Time course features used for model fitting are shown for datasets obtained at 5 pM (top) and 100 pM (bottom) E2 (red) and the 500 best particles obtained by model fitting (gray) (ACF: autocorrelation function)
Model selection favors a two‐state promoter cycle with cell‐to‐cell variability in elongation and initiation rate. Posterior frequencies are shown for all 40 model topologies integrated over all eight E2 concentrations.
Model fitting yields close‐to‐optimal description of synthetic data. Distance distributions of the final particle populations after SMC ABC fitting to synthetic data are shown in gray for each of the benchmark models (y‐axis), box plots indicating the variance of the 2,000 best particles. For comparison and to estimate the best possible distance, the distance measure was calculated between 500 randomly paired simulations using the true model parameters based on which the synthetic data were generated (orange). The dashed gray line denotes a value of 0.5, which was chosen as the final convergence limit of the SMC ABC algorithm.
SMC ABC accurately identifies true parameter values from benchmark datasets. Posterior distributions of model parameters for all benchmark datasets are presented. Red dots indicate the values used to generate the benchmarking datasets. Boxplots show posterior distributions for initiation rate, promoter ON‐time, burst size, and promoter OFF‐time.
SMC ABC determines the correct model topology. The frequency of models in the posterior distribution is shown as color maps for all benchmark datasets in (A). Red dots mark the true model.
Distances of the final particle populations after SMC ABC model fitting to datasets at various estrogen concentrations (Fig EV2A). The dashed gray line indicates a distance value of 0.5 for an optimal fit, as estimated in panel (A).
Posterior parameter distributions. Boxplots show posterior distributions of burst size, ON‐time, and OFF‐time after SMC ABC fitting to all datasets.
Model selection. The frequency of all forty models in all eight final particle populations after SMC ABC model fitting is shown. Only three models (*) were found in all eight posterior distributions (i.e., at all estrogen concentrations). The columns “Topology” and “Variability” indicate the number of ON/OFF promoter states and possible combinations of parameter resampling as specified on the y‐axis in panel (C).
Incorporating extrinsic noise in the model is necessary to fit the data at all E2 concentrations. The final distance distribution of particles after SMC ABC when fitting defined model topologies with and without extrinsic noise over different E2 concentrations is shown. Fitting was performed while fixing a two‐state model topology that either includes no extrinsic noise (1–1–0) or a combination of extrinsic noise in k elong and k init (1–1–5). In addition, two models comprising a single source of extrinsic noise in either k elong (1–1–1) or k init (1–1–2) were fitted to the 10 pM dataset, with this combination of both sources yielding the best fit.
Variability of response times depends on the number of steps in the promoter cycle. Heterogeneity in response times was calculated as the coefficient of variation (CV) from simulations of E2 induction experiments. The distribution of CVs over all SMC ABC posterior particles from the fit to steady‐state dose‐response at 1,000 pM E2 (Fig 3) with models containing either two (left) or ten (right) states is shown as boxplots.
The kinetics of E2 induction in single living cells is similar to ensemble RT–qPCR measurements. MCF7 cells were starved of E2 for 3 days and then induced with either 10 pM (left) or with 1,000 pM E2 (right) with samples being collected every 10 min for RT–qPCR. The location of the qPCR products and of the PP7 sequences is indicated on the schematic of the GREB1 gene structure. Primer pairs were designed to span exon–intron boundaries to exclusively amplify unspliced pre‐mRNA. The mean of three experiments is plotted with standard deviation shown as shaded areas. Response times and delay between amplicons are estimated from the time taken to achieve a threefold induction. The mean intensity of transcription sites from Fig 4A is shown in green and shows excellent agreement with population and single‐cell response.
Minimal promoter models yield a better model fit for induction datasets. SMC ABC was performed with the 10 or 1,000 pM E2 induction datasets, while keeping the model topology fixed at a two‐state (1–1–5) or ten‐state (1–9–5) model. Distances of the final particle populations are shown as boxplots. For both experimental datasets, the two‐state model yields a smaller final distance value than the ten‐state model, indicating better description of the data.
Promoter models with multiple states produce homogenous response times. Simulations of synchronized cells were performed with parameters derived from fits of the 10 pM induction dataset for a promoter cycle with 10 states (t ON = 0.8 min; t OFF = 30 min; b = 6 RNAs/burst, model topology: 1–9–5). The median and CV of response times are indicated in red.
Response times after E2 induction are highly variable. Transcriptional trajectories of labeled GREB1 loci upon addition of a sub‐saturating (10 pM, top) and saturating (1,000 pM, bottom) E2 concentration were sorted based on their response time. MCF7‐GREB1‐PP7 cells were grown under E2‐free conditions for 48 h prior to imaging. After 51 min of imaging, E2 was added at the indicated time point. Median response times are indicated (red arrow).
Simulated synchronization recapitulates variability in induction. Simulation of synchronized cells with parameters obtained from fits of the two‐state (1–1–5) model to the 10 pM (top) and 1,000 pM E2 (bottom) synchronization datasets (panel A) produce trajectories that closely resemble the experiments (t ON = 0.8 min; t OFF, 10 pM = 60 min; t OFF, 1,000 pM = 30 min; b = 6 RNAs/burst; model topology: 1–1–5).
Experimental response‐time heterogeneity is explained by small promoter models. Response times were extracted from simulated induction experiments for all posterior particles of the SMC ABC fit (described in B). The boxplots show how the coefficients of variation (CV) in the response times (calculated over all cells of each simulated cell population) are distributed over the posterior particles (central line: median, box: 25 and 75% percentiles, whiskers: 5 and 95% percentiles). Experimental CVs are indicated as dashed lines with standard deviation from bootstrapping as shaded areas. A two‐state model provides similar response‐time heterogeneity as the data.
Observation of transcriptional bursts from two GREB1 alleles in the same cell. MCF7‐GREB1‐PP7‐Dual cells were grown at 10 pM E2. Two GREB1 transcription sites (red and yellow crosses) were tracked in nuclei (dashed line), and the intensity of both transcription sites was quantified (bottom). The zt‐kymograph demonstrates that both sites stay in focus during the movie. Scale bar: 5 μm. Also see Movie EV4.
Sister alleles have uncorrelated temporal fluctuations. Intensities of both transcription sites from the cell in panel (A) were plotted, with each dot representing one time point. The inset shows bootstrapped correlation coefficients for all 45 cells in panel (C).
Dual allele transcription in a cell population. Transcriptional activity was quantified in MCF7‐GREB1‐PP7‐Dual cells at 10 pM E2. The signal of two alleles from the same cell is represented as pair of rows, separated by a dark line. Cells are sorted for the total RNA output (ΣRNA) of the brighter allele from low (top) to high (bottom) as indicated on the right. The squared coefficient of variation across all alleles is shown.
Total RNA output of sister alleles is highly correlated. The relationship of the total transcriptional output between sister alleles for all cells in panel (C) (pairs). Randomly reassigned alleles from different cells do not show correlation (random). To ensure unbiased calculation of the correlation coefficient, the dataset was doubled with each allele within one cell once assigned as allele one and once as allele two, causing symmetry around the diagonal axis.
Simulations recapitulate transcriptional profiles of dual allele reporter cells. Sister alleles were simulated using two stochastic simulations with the same extrinsic noise realization with parameters yielded by the model fit (Fig 3) to the steady‐state 10 pM E2 dataset (t ON = 0.6 min; t OFF = 50 min; b = 10 RNAs/burst; model topology: 1–1–5).
Simulations incorporating extrinsic noise recapitulate correlation in transcriptional output between alleles. Sister alleles were simulated using the same assumptions and parameters as in (E). The transcription initiation and elongation rates were (1–1–5 model topology; red) or were not (1–1–0 topology; blue) resampled for each cell to include/exclude extrinsic noise. The simulated alleles were treated the same as the real data to calculate correlation coefficients.
Global model fit to datasets at different E2 levels reveals stimulus‐dependent frequency modulation. The distribution of summed particle distances when (1) fitting each E2 concentration with a separate parameter set (Fig 3) and variable model topologies (left); (2) fixing the model topology (1–1–5) and allowing all kinetic parameters to be different between E2 concentrations, (3) only allowing the OFF‐time to change locally with E2 concentration and keep all other parameters global and (4) only allowing the transcription initiation rate to change locally (right).
Scheme of selected global model: Two‐state model with burst frequency (i.e., OFF‐time) modulation by E2 and cell‐to‐cell variability in initiation rate and elongation kinetics.
Parameter posterior distributions. Boxplots show posterior distributions of the promoter OFF‐times obtained from the global SMC ABC fit with only the OFF‐time being a local parameter (left). Histograms depict global posterior distributions and mean value (dashed line) for burst size (middle) and ON‐time (right).
HDAC inhibitors have a dose‐dependent effect on transcriptional activity. MCF7‐GREB1‐PP7 cells were cultured with 100 pM E2 for 3 days, and then, butyrate (NaBu) or TSA was added for 4 h prior to fixation. Mean transcription sites intensities of > 5,000 cells per condition were calculated and fitted to a Hill function. Error bars denote standard deviation from two replicates. An intermediate butyrate concentration (*) was chosen for E2 titration experiments in panel (B).
Inhibitors of deacetylation alter the intensity of transcription sites. MCF7‐GREB1‐PP7 cells were cultured at various concentrations of E2 for 3 days, after which 2.5 mM butyrate (NaBu) was added for 4 h. Transcription sites were detected automatically and quantified using > 2,500 cells per condition. The mean and standard deviation from two biological replicates are plotted and fitted to a Hill function.
Intensity histograms of HDAC inhibitor titration reveal qualitative differences in noise behavior as compared to the E2 dose‐response. Kernel density estimates were calculated from spot intensities from high‐content imaging datasets (same data as in panel A and B). Comparison with stochastic simulations (panel D) indicates a decrease in burst size at low NaBu concentrations, as compared to the E2 dose‐response. Low NaBu doses primarily affect intensity but not the frequency, of the right peak in the histogram (see arrow).
Simulated intensity histograms distinguish burst size from burst frequency modulation. Stochastic simulations were performed (t ON = 1 min; b = 8 RNAs/burst; model topology: 1–1–5) with OFF‐times ranging from 10 (yellow) to 100 min (blue) and global intensity histograms were calculated. Both the proportion and intensity of active transcription sites changes with OFF‐time (top). Burst sizes were changed by increasing the initiation rate from 1 (blue) to 40 RNAs/min (yellow) during stochastic simulations (t ON = 1 min; t OFF = 30 min; model topology: 1–1–5) (bottom). Only the intensity of active transcription sites (x‐position of the right peak) changes with changing burst sizes.
The dose‐response to E2 shows an inverse noise–mean relationship. The measured intrinsic noise component (see Materials and Methods) is plotted as squared coefficient of variation against the mean expression level for each of eight datasets at various E2 concentrations. Dashed lines indicate the theoretical inverse noise–mean relation at fixed burst sizes, and changing burst frequencies (CV2 = burst size/meanRNA). Error bars represent standard deviation from bootstrapping.
Deacetylase inhibition reduces GREB1 output. Transcription was quantified in cells exposed to 20 pM E2 after 4 h of pre‐treatment with DMSO or butyrate (NaBu). Each line in the color maps represents an intensity trajectory from an individual cell. The total RNA output is plotted on the right with the CV2 indicated.
Global intensity distributions for the datasets in panel (B). A shift to lower expression levels is apparent upon butyrate treatment.
Deacetylase inhibition affects noise in nascent GREB1 transcription. The noise–mean relation is shown as in panel (A) for cells treated with 20 pM E2 or with solvent control (DMSO) or butyrate (NaBu). Butyrate leads to reduced GREB1 expression at a lower noise level as compared to an E2 concentration with similar mean (E2 titration follows dashed lines, see panel A).
Butyrate predominantly reduces the burst size. Distributions of parameter posteriors from SMC ABC fits to DMSO and NaBu datasets (B) are shown as boxplots (central line: median, box: 25 and 75% percentiles, whiskers: 5 and 95% percentiles). A low dose of butyrate only affects burst size, while a higher dose also increases OFF‐times.
References
-
- Battich N, Stoeger T, Pelkmans L (2015) Control of transcript variability in single mammalian cells. Cell 163: 1596–1610 - PubMed
Publication types
MeSH terms
Substances
LinkOut - more resources
Full Text Sources
Other Literature Sources
Research Materials
