Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2021 Dec 2;49(21):12178-12195.
doi: 10.1093/nar/gkab1100.

Assessing genome-wide dynamic changes in enhancer activity during early mESC differentiation by FAIRE-STARR-seq

Affiliations

Assessing genome-wide dynamic changes in enhancer activity during early mESC differentiation by FAIRE-STARR-seq

Laura V Glaser et al. Nucleic Acids Res. .

Abstract

Embryonic stem cells (ESCs) can differentiate into any given cell type and therefore represent a versatile model to study the link between gene regulation and differentiation. To quantitatively assess the dynamics of enhancer activity during the early stages of murine ESC differentiation, we analyzed accessible genomic regions using STARR-seq, a massively parallel reporter assay. This resulted in a genome-wide quantitative map of active mESC enhancers, in pluripotency and during the early stages of differentiation. We find that only a minority of accessible regions is active and that such regions are enriched near promoters, characterized by specific chromatin marks, enriched for distinct sequence motifs, and modeling shows that active regions can be predicted from sequence alone. Regions that change their activity upon retinoic acid-induced differentiation are more prevalent at distal intergenic regions when compared to constitutively active enhancers. Further, analysis of differentially active enhancers verified the contribution of individual TF motifs toward activity and inducibility as well as their role in regulating endogenous genes. Notably, the activity of retinoic acid receptor alpha (RARα) occupied regions can either increase or decrease upon the addition of its ligand, retinoic acid, with the direction of the change correlating with spacing and orientation of the RARα consensus motif and the co-occurrence of additional sequence motifs. Together, our genome-wide enhancer activity map elucidates features associated with enhancer activity levels, identifies regulatory regions disregarded by computational prediction tools, and provides a resource for future studies into regulatory elements in mESCs.

PubMed Disclaimer

Figures

Figure 1.
Figure 1.
FAIRE-STARR-seq in mouse embryonic stem cells. (A) Schematic representing the workflow of FAIRE-STARR-seq. (B) Heatmaps depicting normalized read distribution of the FAIRE-STARR input library, DNase- and ATAC-seq at the FAIRE-STARR input regions. (C) Correlation analysis of genome-wide read distribution, comparing the input library with DNase-seq data (ENCODE). Normalized and log1p transformed reads per 10 kb genomic bin are shown. (D) Heatmap showing normalized FAIRE-STARR-seq signal at active (4765) or inactive (182 194) input regions. (E) Exemplary genomic region encompassing the Pou5f1 gene. The FAIRE-STARR-seq signal merged from three replicates (normalized to input) is shown and the three active (weakly and highly active) regions detected in the depicted region, as well as one exemplary inactive region covered by our library, are highlighted. In addition, ChIP-seq data of histone modifications (HMs) as indicated, RNA-seq, ATAC-seq and input library signal from mESCs are shown. (F) Genomic distribution of input regions and FAIRE-STARR enhancers with respect to annotated Refseq genes. Promoters were defined as the regions 1 kb upstream of a TSS. (G) Motif enrichment analysis comparing the 4765 FAIRE-STARR active and an equal number of randomly sampled inactive input regions (mean E-values of ten subsamplings from inactive regions). Enrichment of motif clusters is indicated as −log10E-value and the -log ratio comparing active versus inactive enrichment is shown. Enriched motifs (E ≤ 1e−3) with a minimum 15-fold −log difference of E-values between the two groups are shown. The JASPAR 2018 vertebrate clustered motif database was used as reference and listed TF names display TF groups clustered by consensus motif similarity (57). (H) Anchorplots showing mean normalized ChIP-seq enrichment of the indicated HMs or TFs at FAIRE-STARR active or inactive input regions.
Figure 2.
Figure 2.
FAIRE-STARR-seq enables quantification of enhancer activity and activity level-associated sequence features. (A) FAIRE-STARR enhancers were ranked for their activity (log STARR score) and divided into five groups of ascending enhancer activity (highlighted by increasing background coloring). Dashed lines depict the 10th and 90th percentiles of STARR activity. (B) Expression of genes paired with FAIRE-STARR enhancers, for each of the five activity groups as depicted in (A). Genes were paired with FAIRE-STARR enhancers by distance using GREAT (84) and TPM values of RNA-seq data are shown. Boxplots depict the distribution of expression of all genes per group, whiskers extend to 1.5 IQR. TPM values of individual genes are shown as dots. P-values for unpaired Wilcoxon tests comparing neighboring groups are indicated. (C) TF sequence motifs enriched at active FAIRE-STARR enhancers, comparing the most active 10% (high) and least active 10% (low) of the active enhancers. Enriched motifs (E ≤ 1e−3) with a minimum 1-fold -log enrichment ratio between the two groups are shown. The JASPAR 2018 vertebrate clustered motif database was used as reference and a representative TF for each cluster is listed (57). Boxplots depicting (D) the number of significantly enriched motifs and (E) length of low- or high-ranking enhancers. Means are indicated as well as P-values for unpaired Wilcoxon tests comparing the two groups.
Figure 3.
Figure 3.
Functional mESC enhancers reside in different epigenomic environments. (A) FAIRE-STARR enhancers were clustered (k-means clustering) based on the enrichment of the eight investigated histone modifications. For each cluster, the STARR-, ATAC- and RNA-seq signals were plotted as were promoter regions (Prom) defined as 1 kb up- and downstream of the TSSs of annotated Refseq genes. Enhancer probability scores predicted by CRUP from mESC data are also shown. (B) Genes were assigned to enhancer clusters using GREAT and RNA-seq expression data is shown as dots for individual genes (TPM normalized) and as boxplots for each enhancer cluster. (C) Gene ontology analysis of genes associated with each enhancer cluster showing the fifteen most significant GO terms per cluster and their false discovery rate (-log10BionomFdrQ, cutoff 1e−03). For each ontology, the number of observed genes (ObsGenes), the significance, and the source of the assigned ontology are shown. (D) TF motif enrichment analysis (AME) for each enhancer cluster using the JASPAR 2018 vertebrate clustered motif matrices. TF motifs which were enriched (E ≤ 1e−5) for at least one cluster were clustered for TF occurrences applying wards clustering and Manhattan similarity measures. (E) Venn diagram showing the intersection of FAIRE-STARR and CRUP enhancers. FAIRE-STARR enhancers which overlap with CRUP enhancers (pos) or do not overlap (neg) were assigned to the HM clusters defined in (A).
Figure 4.
Figure 4.
Sequence-based prediction of active enhancers. (A) The regions of the FAIRE-STARR library were first divided by their overlap with ENSEMBL promoters (region up to 500 bp upstream of a TSS). Regions which overlap with promoters were used to generate an E-promoter prediction model, whereas those not overlapping were used for the enhancer prediction model. To this end both groups (B) putative enhancers and S4A) promoters) were ranked for their STARR score, and 1 or 10% highest or lowest ranking regions were used for model generation. (C) Cartoon depicting how the enhancer prediction model was trained on ranked regions from our FAIRE-STARR-seq analysis using enrichment of JASPAR 2018 vertebrate clustered motif matrices and region width as independent variables. 75% of the highest and lowest ranking regions were used for model training, while the remaining 25% were used for testing. (D) Plot shows model performance as receiver operating characteristic (ROC) curve for each of the outer cross-validation folds, mean ROC curve with area under the curve (AUC) and its standard deviation. (E) The 30 most predictive variables for the optimal enhancer prediction model and their coefficients are shown. Positive coefficients indicate a positive association with high STARR scores, while motifs with negative coefficients are associated with low-scoring elements.
Figure 5.
Figure 5.
Differentiation-associated changes in enhancer activity. (A) Treatment scheme to investigate how inducing differentiation of mESCs changes enhancer activity. (B) Mean FAIRE-STARR signal (top) and heat-maps (bottom) for LIF-dependent and RA-inducible STARR enhancers. (C) Number of differentially expressed genes (|log2FC| ≥ 1, Padj ≤ 0.05) paired with enhancers by distance using GREAT. (D) Differentially enriched TF motif clusters (JASPAR 2018 vertebrate clustered matrices) for RA-induced and LIF-dependent enhancers were identified using AME (E ≤ 1e−3, −logRatio ≥ 5). (E) Candidate FAIRE-STARR enhancers were cloned individually and assessed for enhancer activity by RT-qPCR targeting GFP reporter transcripts (normalized to Rpl19 and Actb). 20 h after transfection, E14 mESCs were treated for 4 h either with LIF, RA or ES medium only (none). Bar plots show normalized mean expression ± SE of three replicates (dots). (F) TF motif-matches identified by JASPAR scan (Supplementary Table S1) for enhancers as indicated were deleted by site-directed mutagenesis and activity was analyzed as described in (E). (G and H) Genomic loci encompassing STARR enhancers selected for genomic deletion using CRISPR/Cas9. Normalized FAIRE-STARR-input, -seq, and RNA-seq signals are shown. RefSeq genes are shown in either black (protein coding genes) or red (non-coding genes). (I and J) Expression of genes (RT-qPCR) near the deleted enhancer and of control genes for clonal wild type (wt) and enhancer heterozygous (E+/−) and homozygous (E−/−) deletion clones. Bar plots represent mean gene expression ± SE of three biological replicates (dots) and 1–3 clonal cell lines (number indicated in brackets) after 4 h of LIF or RA treatment. Data points for individual clonal lines are shown as dots with matching shapes.
Figure 6.
Figure 6.
RA-induced changes in enhancer activity at RARα-occupied sites correlate with specific sequence and chromatin features. (A) Genome browser view of an exemplary genomic region encompassing RA-inducible and non-inducible RARα binding sites. Normalized ATAC-, FAIRE-STARR and RNA-seq signals for LIF or RA treated cells and RefSeq genes for this region in either black (protein coding genes) or red (non-coding genes). (B) Distribution of changes in STARR activity (log2 fold change STARR score RA/LIF) and mean STARR activity (log score, for both treatments) of RARα-occupied regions that are covered in our FAIRE-STARR input library (as shown in Supplementary Figure S6A). Only regions with a minimum mean STARR activity ≥2.5 were included for further analysis. The 10% most induced, 10% most repressed and an equal number of regions that do not respond to RA treatment (non-resp.) were used for motif enrichment and TF binding analyses. (C) Enriched TF motif clusters (JASPAR 2018 clustered motif matrices) at induced, repressed, and non-responsive RARα-occupied sites. TF motif clusters with a maximum E-value of 1e−5 for at least one group and a log fold change ≥2 of induced or repressed over non-responsive regions are shown. Z-score normalization of E-values per row was performed. (D) Different spacings (0–8 nucleotides) and orientations (direct (DR), inverted (IR), and everted repeat (ER)) of the RARα::RXRα consensus motif (MA0159.1, upper panel, arrows highlight repeat orientation) were constructed in silico and used for motif enrichment analysis using AME. Only motifs which showed significant enrichment (E-value ≤ 1e−3) for at least one RARα binding site group are shown. Z-score normalization of E-values per row was performed. (E) Enhancer activity measured by STARR-RT-qPCR for selected spacing variants of the RARα::RXRα consensus motif (MA0159.1) and neighboring TF motifs as indicated (scr = scrambled motif) after 4 h of LIF or RA treatment. Bar plots depict the mean GFP expression + SE for three biological replicates. (F) Mean enrichment of RARα and H3K27ac as well as chromatin accessibility (ATAC) at induced, repressed, and non-responsive RARα-occupied sites.

Similar articles

Cited by

References

    1. Ernst J., Kheradpour P., Mikkelsen T.S., Shoresh N., Ward L.D., Epstein C.B., Zhang X., Wang L., Issner R., Coyne M.et al. .. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011; 473:43–49. - PMC - PubMed
    1. Dunham I., Kundaje A., Aldred S.F., Collins P.J., Davis C.A., Doyle F., Epstein C.B., Frietze S., Harrow J., Kaul R.et al. .. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012; 489:57–74. - PMC - PubMed
    1. Bulger M., Groudine M.. Functional and mechanistic diversity of distal transcription enhancers. Cell. 2011; 144:327–339. - PMC - PubMed
    1. Buecker C., Wysocka J.. Enhancers as information integration hubs in development: Lessons from genomics. Trends Genet. 2012; 28:276–284. - PMC - PubMed
    1. Heintzman N.D., Stuart R.K., Hon G., Fu Y., Ching C.W., Hawkins R.D., Barrera L.O., Van Calcar S., Qu C., Ching K.A.et al. .. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 2007; 39:311–318. - PubMed

Publication types

Substances