Mol Cell. 2023 Apr 6;83(7):1140-1152.e7.
doi: 10.1016/j.molcel.2023.02.027. Epub 2023 Mar 16.

Synthetic regulatory genomics uncovers enhancer context dependence at the Sox2 locus

Ran Brosh et al. Mol Cell.

Abstract

Sox2 expression in mouse embryonic stem cells (mESCs) depends on a distal cluster of DNase I hypersensitive sites (DHSs), but their individual contributions and degree of interdependence remain a mystery. We analyzed the endogenous Sox2 locus using Big-IN to scarlessly integrate large DNA payloads incorporating deletions, rearrangements, and inversions affecting single or multiple DHSs, as well as surgical alterations to transcription factor (TF) recognition sequences. Multiple mESC clones were derived for each payload, sequence-verified, and analyzed for Sox2 expression. We found that two DHSs comprising a handful of key TF recognition sequences were each sufficient for long-range activation of Sox2 expression. By contrast, three nearby DHSs were entirely context dependent, showing no activity alone but dramatically augmenting the activity of the autonomous DHSs. Our results highlight the role of context in modulating genomic regulatory element function, and our synthetic regulatory genomics approach provides a roadmap for the dissection of other genomic loci.

Keywords: CTCF; enhancers; gene regulation; genetic engineering; genome writing; stem cells; synthetic regulatory genomics.

Conflict of interest statement

Declaration of interests: R.B., J.D.B., and M.T.M. are listed as inventors on a patent application describing Big-IN. J.D.B. is a founder and Director of CDI Labs, Inc., a founder of and consultant to Neochromosome, Inc., a founder of, SAB member of, and consultant to ReOpen Diagnostics, LLC, and serves or served on the Scientific Advisory Board of the following: Logomix, Inc., Sangamo, Inc., Modern Meadow, Inc., Rome Therapeutics, Inc., Sample6, Inc., Tessera Therapeutics, Inc., and the Wyss Institute.

Figures

Figure 1. A synthetic regulatory genomics pipeline for investigation of Sox2 locus architecture.
(A) mESC engineering strategy. The BL6 allele of the 143-kb Sox2 locus (left) or the 41-kb Sox2 LCR (right) was replaced with LP-PIGA2 or LP-PIGA3, respectively. Landing pad integration was aided by Cas9 using a pair of gRNAs targeting both the replaced allele and the landing pad plasmid, and by short homology arms that facilitate homology-directed repair. Landing pad mESCs were selected with puromycin, while ganciclovir (GCV) selected against integration of the landing pad vector backbone (BB). Cre recombinase-mediated cassette exchange (RMCE) enabled replacement of each landing pad with a series of payloads. Transfected cells were transiently selected with blasticidin, followed by counterselection of landing pad mESCs with proaerolysin. (B) Schematic of DNA assembly, mESC engineering, verification, and analysis pipelines. (C) Allele-specific qRT-PCR assay for Sox2 expression. BL6 and Castaneus (CAST) expression was measured using allele-specific primers in parental BL6xCAST mESCs, in mESCs with deletion of the 143-kb Sox2 locus (ΔSox2) or 41-kb deletion of the LCR (ΔLCR), as well as in BL6 MK6 mESCs. See also Figure S1.
Figure 2. Redundancy of proximal enhancers at the Sox2 locus.
(A) Schematic of the Sox2 locus. Shown are DNase-seq, CTCF ChIP-seq, reporter assay (STARR-seq and luciferase), and PRO-seq data from mESCs. Orientation of CTCF recognition sequences is indicated where applicable. (B) Sox2 expression analysis for payloads delivered to the Sox2 locus in LP-Sox2 mESCs. Blue rectangles demarcate genomic regions included in each payload. Each point represents Sox2 expression from the engineered allele in an independent mESC clone. Bars indicate median. Expression from the BL6 allele was normalized to the CAST allele and scaled from 0 (ΔSox2) to 1 (WT). See also Figure S2.
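The normalization described in this caption can be illustrated with a short Python sketch. It assumes a simple linear rescaling in which the BL6/CAST ratio of the ΔSox2 control maps to 0 and that of the wild-type control maps to 1; the caption does not spell out the exact formula, and the function name, argument names, and numbers below are hypothetical.

```python
from statistics import median


def scale_expression(bl6, cast, ratio_null, ratio_wt):
    """Rescale a BL6/CAST ratio so that ratio_null maps to 0 and ratio_wt to 1.

    Assumption: a linear (min-max style) rescaling between the two controls.
    """
    ratio = bl6 / cast  # normalize the engineered (BL6) allele to the CAST allele
    return (ratio - ratio_null) / (ratio_wt - ratio_null)


# Hypothetical (BL6, CAST) measurements for three independent clones of one payload.
clones = [(0.45, 1.0), (0.55, 1.0), (0.50, 1.0)]
scaled = [scale_expression(b, c, ratio_null=0.0, ratio_wt=1.0) for b, c in clones]

# The figure reports the median across clones for each payload.
payload_median = median(scaled)
```

Normalizing to the unmodified CAST allele gives each clone an internal control, so that clone-to-clone variation in overall expression largely cancels out of the ratio.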
Figure 3. Quantifying the essentiality of Sox2 LCR DHSs.
(A) Schematic of the Sox2 LCR showing DNase-seq, CTCF ChIP-seq, reporter assay (STARR-seq and luciferase), and PRO-seq data from mESCs. (B) Sox2 expression analysis for payloads with deletions of single and multiple DHSs within the core LCR delivered to LP-LCR mESCs. Blue rectangles demarcate genomic regions included in each payload. Each point represents Sox2 expression from the engineered allele in an independent mESC clone. Bars indicate median. Expression from the BL6 allele was normalized to the CAST allele and scaled from 0 (ΔSox2) to 1 (WT LCR). See also Figures S2 and S3.
Figure 4. DHS function at the Sox2 LCR is context-dependent.
Sox2 expression analysis for selected LCR DHSs replacing the full LCR. Detailed payload structures are shown in Figure S3. Each point represents Sox2 expression from the engineered allele in an independent mESC clone. Bars indicate median. Expression from the BL6 allele was normalized to the CAST allele and scaled from 0 (ΔSox2) to 1 (WT LCR). (A-B) Sufficiency of minimal payloads. (C) Contribution of LCR DHSs in different DHS contexts, computed as the difference in expression (ΔExpression) between pairs of payloads that differ solely by the presence of each focus DHS. The presence (green dots) or absence (gray dots) of each surrounding DHS is indicated; two copies of DHS24 are shown in red. Points indicate mean and bars indicate SD across all pairwise combinations of clones. Source data are from A-B and Figure 3B. (D) Summary of activity and context sensitivity of LCR DHSs. Overall activity was defined as the maximum, and context-dependent activity as the range, of the values in C. (E-F) Effect of DHS orientation (E) and order (F). Expression of corresponding baseline payloads is repeated in blue for each payload. Arrows indicate extent of inversions in E. See also Figures S2 and S3.
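The ΔExpression calculation in panels C and D can be sketched in Python as follows. This is an illustrative reading of the caption, not the authors' code: the function names are hypothetical, the sample standard deviation is an assumption (the caption says only "SD"), and the numbers are invented.

```python
from itertools import product
from statistics import mean, stdev


def delta_expression(with_dhs, without_dhs):
    """Mean and SD of expression differences over all pairwise clone combinations.

    with_dhs / without_dhs: scaled expression of clones from two payloads that
    differ only by the presence of the focus DHS.
    """
    diffs = [a - b for a, b in product(with_dhs, without_dhs)]
    return mean(diffs), stdev(diffs)  # assumption: sample SD


def summarize_activity(context_means):
    """Overall activity = maximum ΔExpression; context-dependent activity = its range."""
    return max(context_means), max(context_means) - min(context_means)


# Hypothetical clone measurements for one focus DHS in a single context.
m, sd = delta_expression([0.9, 1.1], [0.4, 0.6])

# Hypothetical mean ΔExpression values for the same DHS across three contexts.
overall, context_dependent = summarize_activity([0.5, 0.25, 0.0])
```

Under these definitions, a DHS with a large range and a minimum near zero contributes only in particular contexts, while a DHS with a small range contributes about equally regardless of its neighbors.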
Figure 5. Modeling the regulatory architecture of the Sox2 locus.
(A) Relative predictive performance of linear regression models for Sox2 expression. Models included features for DHS presence throughout the locus and orientation, order, and interaction terms for core LCR DHSs as indicated by blue dots. Model performance was measured by the Bayesian information criterion (BIC) and is presented as Fit (difference from maximum BIC). Red indicates the model with best fit. (B) Coefficients for the best model. See also Figure S4.
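A minimal Python sketch of this kind of model comparison is given below. It assumes the usual Gaussian-error form of the BIC for least-squares regression (up to an additive constant) and assumes that Fit is the maximum BIC minus each model's BIC, so that larger values indicate better models; the caption does not state the sign convention, and the model names and BIC values are hypothetical.

```python
import math


def bic(rss, n_obs, n_params):
    """BIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


def fit_scores(bics):
    """Express each model's BIC as its difference from the maximum (worst) BIC."""
    worst = max(bics.values())
    return {name: worst - value for name, value in bics.items()}


# Hypothetical BIC values for three candidate feature sets.
bics = {"presence": 10.0, "presence+orientation": 7.0, "presence+interactions": 4.0}
fits = fit_scores(bics)
best_model = max(fits, key=fits.get)  # lowest BIC, hence largest Fit
```

Because the BIC penalizes each added parameter by ln(n), adding orientation, order, or interaction terms improves Fit only if they reduce the residual error by more than that penalty.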
Figure 6. TF-scale dissection of LCR function.
(A) Genomic region surrounding the Sox2 LCR DHSs 23-26 showing DNase-seq and CTCF ChIP-seq data in mESCs and selected payloads. Gray rectangles indicate poly(C) and poly(T) sequences shortened in synthetic (syn) versions of DHSs 23 or 24. (B) Enlargement of engineered regions within DHS23 to DHS24 (left) and CTCF25 to DHS26 (right) showing DNase-seq windowed density and per-nucleotide cleavages, and ChIP-nexus data for selected TFs in mESCs. Payload schemes indicate surgical deletions of predicted TF recognition sites. Magenta triangles denote CTCF sites and their native orientation. Additional TF sites overlapping CTCF sites are shown in black. (C-F) Sox2 expression analysis for perturbations of TF recognition sequences within the core LCR. Payloads include deletions (Δ) of TF or CTCF recognition sequences shown in B. Each point represents Sox2 expression from the engineered allele in an independent mESC clone relative to baseline payloads (dashed vertical lines). Bars indicate median. Expression was scaled from 0 (ΔSox2) to 1 (WT LCR), and vertical gray lines indicate median expression of relevant baseline payloads. (C) Analysis of TF site deletions within DHS24. The presence (green dots) or absence (gray dots) of each TF site is indicated; mutations are shown in red. syn24.2 contains only a minimal region surrounding DHS24.2 (see B). syn24 (mut24.1-24.4) contains point substitutions, rather than deletions, that ablate TF recognition sequences 24.1-24.4 (Figure S5). (D) Contribution of TF sites 24.1-24.4, computed as the difference in expression (ΔExpression) between pairs of payloads differing solely by the presence of each focus TF recognition sequence. The presence (green dots) or absence (gray dots) of each surrounding TF site is indicated. Points indicate mean and bars indicate SD across all pairwise combinations of clones. Source data are from C. (E) Analysis of DHS23. (F) Analysis of CTCF25 and DHS26. 23-27 (ΔCTCF) has 8 CTCF sites surgically deleted. 23-27 (Divergent CTCF) has surgical inversion of 3 CTCF sites (TF sites 25.2, 26.1 and 26.5) so that all 9 CTCF sites lie in divergent orientation relative to Sox2. See also Figures S2, S5 and S6.
