Nat Struct Mol Biol. 2025 Mar;32(3):479-490.
doi: 10.1038/s41594-024-01431-2. Epub 2024 Dec 16.

Integrative analysis of the 3D genome and epigenome in mouse embryonic tissues

Miao Yu et al. Nat Struct Mol Biol. 2025 Mar.

Abstract

While a rich set of putative cis-regulatory sequences involved in mouse fetal development have been annotated recently on the basis of chromatin accessibility and histone modification patterns, delineating their role in developmentally regulated gene expression continues to be challenging. To fill this gap, here we mapped chromatin contacts between gene promoters and distal sequences across the genome in seven mouse fetal tissues and across six developmental stages of the forebrain. We identified 248,620 long-range chromatin interactions centered at 14,138 protein-coding genes and characterized their tissue-to-tissue variations and developmental dynamics. Integrative analysis of the interactome with previous epigenome and transcriptome datasets from the same tissues revealed a strong correlation between the chromatin contacts and chromatin state at distal enhancers, as well as gene expression patterns at predicted target genes. We predicted target genes of 15,098 candidate enhancers and used them to annotate target genes of homologous candidate enhancers in the human genome that harbor risk variants of human diseases. We present evidence that schizophrenia and other adult disease risk variants are frequently found in fetal enhancers, providing support for the hypothesis of fetal origins of adult diseases.

Conflict of interest statement

Competing interests: B.R. is a cofounder and shareholder of Arima Genomics and Epigenome Technologies. A.A. is currently an employee of Meta. A.D.S. is an employee of Arima Genomics. The other authors declare no competing interests.

Figures

Fig. 1. Characteristics of promoter-centered interactions identified from H3K4me3 PLAC-seq across 12 tissues and developmental stages during mouse fetal development.
a, Overview of the experimental design. b, Number of MAPS-identified chromatin interactions. c, The density plot of interaction distance across 12 tissues, each represented by distinct colors shown in b. d, Fraction of P2P and P2N interactions across 12 samples. e,f, Box plots showing the enrichment of promoter-interacting regions for accessible chromatins: histone marks of P2N interactions (e) and different chromatin states (f). A fold change of 1 is marked by the horizontal dashed line (n = 12). En–Sd, strong TSS–distal enhancer; En–W, weak TSS–distal enhancer; En–Pd, poised TSS–distal enhancer; Tr-S, strong transcription; Tr-P, permissive transcription; Tr-I, initiation transcription; Hc-P, Polycomb-associated heterochromatin; Hc-H, H3K9me3-associated heterochromatin. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (first quartile − 1.5 × (third quartile − first quartile)) to (third quartile + 1.5 × (third quartile − first quartile)). Two-tailed single sample t-test comparing to μ = 1, with FDR multiple comparison adjustment. *P ≤ 0.05, ***P ≤ 0.001 and ****P ≤ 0.0001. g, Average proportion of long-range interactions classified according to presence or absence of CTCF binding on one or both ends across 12 tissues (n = 12). A P2P interaction might be bound by CTCF at both ends (both), one of the two ends (one-sided) or neither end (neither). A P2N interaction might be bound by CTCF at both ends (both), only the promoter end (P-side only), only the nonpromoter end (N-side only) or neither end (neither). h, Average fraction of candidate interaction anchors involved in long-range interactions (n = 12). Anchors bound or not bound by CTCF are considered separately. Data are presented as the mean values ± s.d. Two-tailed paired t-test. 
i, A bar plot showing the number and the fraction of upstream (light cyan) and downstream (light blue) distal-interacting regions that form interactions with the promoter anchors bound by CTCF in forward, reverse or dual orientation in FB E12.5. P values were calculated using the chi-square test.
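The captions report P values "with FDR multiple comparison adjustment" but do not name the procedure. The sketch below assumes the widely used Benjamini-Hochberg step-up method, which may not be what the authors applied; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the caption's "FDR adjustment" is Benjamini-Hochberg;
    the source does not state which procedure was used.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative P values only; these are not taken from the paper.
example = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(example)
```

Each adjusted value can then be compared against the thresholds given in the caption (0.05, 0.001, 0.0001) to assign the asterisk labels.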
Fig. 2. Tissue-to-tissue variability and developmental dynamics of promoter-centered interactomes.
a, PCA for the normalized PLAC-seq contact frequency. b, Heat map displaying normalized contact frequencies (left), gene expression of interacting promoters (middle left), H3K27ac distal peak signal (middle right) and promoter H3K4me3 signal (right) in P2N tissue-specific interactions. % of row total: individual tissue percentage of total sum across all tissues. c, Scatter plot between the difference of interaction number and the difference in gene expression. The red dashed line represents the fitted linear line. d, Interactions anchored at TSS regions around four genes (highlighted by yellow boxes) in tissues at E12.5. Black boxes above the refseq gene track mark the gene boundary of each anchored gene and the arrows on the top represent their transcription direction.
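The "% of row total" scaling used in panel b can be sketched as follows: each value in a row (one interaction or gene across tissues) is divided by the row sum and expressed as a percentage. This is a minimal illustration; the function name and the example values are hypothetical, and the handling of an all-zero row is an assumption.

```python
def percent_of_row_total(row):
    """Express each value as a percentage of the sum across all tissues."""
    total = sum(row)
    if total == 0:
        # Assumption: an all-zero row is reported as all zeros.
        return [0.0 for _ in row]
    return [100.0 * value / total for value in row]


# Illustrative signal for one feature across three tissues (not real data).
row_percent = percent_of_row_total([1, 1, 2])
```

Scaling each row this way makes features with very different absolute signal comparable within the same heat map.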
Fig. 3. Profiling cEnhancer–gene pairs in different fetal tissues.
a, Schematic for assigning cCREs to target genes. b, Density plot of SCCs between cCRE H3K27ac and interacting gene mRNA. c, Box plots of distributions for average phastCons scores for each nucleotide of all cCREs (d-TACs, with a chromatin-accessible peak present in any of the 12 samples where PLAC-seq data were generated), size-matched random regions (shuffled), interacting cCREs (cCREs involved in 91,451 cCRE–gene pairs), predicted enhancers from Supplementary Table 8c of Gorkin et al., all cEnhancers or cCREs with H3K27ac levels matching cEnhancers (matched H3K27ac). Two-tailed Wilcoxon rank-sum test: *P < 2.2 × 10−16. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (first quartile − 1.5 × (third quartile − first quartile)) to (third quartile + 1.5 × (third quartile − first quartile)). d, Validation rate of indicated elements tested for in vivo enhancer activity from the VISTA elements database. Chi-square test: **P < 0.001 and *P < 0.05. e, Heat map for chromatin features and expression of E12.5 cEnhancer–gene pairs. Pairs were k-means clustered by H3K27ac signal surrounding 2 kb at the center of cEnhancer. f, Top enriched GO biological process terms for genes from clusters in e. g, Example gene, Neurod1, showing correlated E12.5 tissue-specific H3K27ac signal at cEnhancer, gene expression and cEnhancer–gene interactions. The cEnhancer–gene interactions in this region from each E12.5 tissue are marked by arcs. The Neurod1 TSS region is highlighted by a yellow box and the cEnhancers are highlighted by blue boxes. h, Enriched known and de novo motifs at cEnhancers that have TF gene expression matching cEnhancer H3K27ac tissue patterns.
Fig. 4. Chromatin interactions facilitate interpretation of noncoding disease risk variations in the human genome.
a, LDSC regression analysis to identify GWAS enrichments at cEnhancers from E12.5 clusters shown in Fig. 3e. b, Box plots showing distribution of H3K27ac counts at cEnhancers overlapping SCZ credible set SNPs. Two-tailed paired t-test between E12.5 FB and non-neural tissues. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (first quartile − 1.5 × (third quartile − first quartile)) to (third quartile + 1.5 × (third quartile − first quartile)). c, Heat map of gene expression for genes identified as interacting with cEnhancers overlapping SCZ credible set SNPs (n = 35); psychENCODE SCZ genes are marked by an asterisk. Adult mouse cortex and cerebellum RNA-seq data were obtained from ENCODE. d, GO analysis showing top enriched biological process terms of genes interacting with cEnhancers harboring SCZ credible set SNPs. e, SCZ example genes, Ascl1 and Emx1, interacting with loci harboring SCZ candidate SNPs. Tracks display H3K27ac, gene expression and all identified nearby cEnhancer–gene interactions. Loci are labeled with SNPs rs10860964 and rs62148129. Select images of mouse embryos displaying positive VISTA enhancer activity in the displayed genomic regions were obtained from the VISTA database. All cEnhancer–gene pairs with a PLAC-seq interaction in any tissue or stage tested present in these genomic regions are marked by arcs.
Extended Data Fig. 1. Benchmark the quality of H3K4me3 PLAC-seq data.
a. A bar plot showing the number of MAPS-identified interactions from each biological replicate with proportions of reproducible interactions highlighted by darker colors. b. Scatterplots showing the reproducibility of the normalized contact frequencies between two biological replicates across 12 tissues and stages. PCC: Pearson correlation coefficients. P-values from Pearson correlation test. c, d. Proportions of interactions identified from published Capture-C datasets using 99th (c) or 95th (d) percentile as thresholds that overlap with those identified from H3K4me3 PLAC-seq data using the closest tissues (See Methods). Pseudo-interactions that link an H3K4me3-containing bin with a bin equidistant from but on the other side of H3K4me3-containing bins were used as the control set. FL, forelimb. HL, hindlimb. MB, midbrain. E, embryonic day. e, g. Percent of interactions identified from published Capture-C datasets that are present specifically in one tissue type or shared by both types of tissues at 10-kb resolution, using 99th (e) or 95th (g) percentile of the empirical distribution as thresholds. f, h. Proportions of tissue-specific Capture-C interactions (the orange part in e, g) that overlap with those identified from H3K4me3 PLAC-seq data from midbrain E12.5 or limb E12.5. Comparisons between closely related tissues (limb vs. limb, midbrain vs. midbrain) are colored in dark blue while comparisons between non-closely-related tissues (limb vs. midbrain) are colored in light blue. **** Chi-square P-value < 2.2e-16.
Extended Data Fig. 2. Promoter-interacting regions are significantly enriched for accessible chromatin and specific histone marks across all 12 tissues.
a. Bar plots showing the percent of promoter-interacting bins overlapping the reproducible peaks identified from ATAC-seq, H3K27ac ChIP-seq, H3K4me1 ChIP-seq, H3K27me3 ChIP-seq or H3K9me3 ChIP-seq from the same tissue. Bins equidistant from but on the other side of H3K4me3-containing bins were used as the control set, as shown in the schematic diagram on the top. b. Bar plots showing the fraction of base pairs in promoter-interacting bins overlapping the reproducible chromHMM state calls from the same tissue, categorized by enhancer (top), transcription (middle), and heterochromatin (bottom) status. Bins equidistant from but on the other side of H3K4me3-containing bins were used as the control set, as shown in the schematic diagram on the top.
Extended Data Fig. 3. The role of CTCF in promoter-centered chromatin interactions.
a. Pie chart for percentage of CTCF peaks averaged across tissues falling within indicated genomic annotation. b. Boxplots showing enrichment of replicated CTCF peaks inside indicated genomic annotation. Shuffled regions across the genome served as the control set. Dashed line marks a fold-change of 1 (n = 12). Two-tailed single sample t-test comparing to mu = 1 with FDR multiple comparison adjustment, ‘ns’: not significant, *p <= 0.05, ***p <= 0.001, ****p <= 0.0001. c, d. Proportion of P2P (c) or P2N (d) interactions classified according to presence or absence of CTCF binding on one or both ends in each tissue or developmental stage. e, f. CTCF motif orientation on P2P (e) and P2N (f) interactions with both ends containing CTCF binding (n = 12). P-value < 2 × 10−16 from chi-square test in all 12 tissues by comparing to expected fraction 0.25 for all 4 orientation combinations. g. Fraction of H3K4me3-containing 10-kb bins bound or not bound by CTCF that are involved in long-range interactions in 12 tissues. Data are mean values ± SD. Chi-square test, ****p <= 0.0001. h. Bar plot showing number of CTCF peaks in candidate promoter anchor bins in 12 tissues/stages (n = 12). Data are mean values ± SD. i. Boxplots showing number of distal regions interacting with candidate promoter anchor bins with different numbers of CTCF peaks. ****p < 2 × 10−16 from ANOVA test in all 12 tissues. j. Bar plots showing number of upstream (light cyan) and downstream (light blue) distal-interacting regions that interact with promoter anchors bound by CTCF in forward, reverse or dual orientation in 12 tissues. Chi-square test. k. Boxplot showing enrichment of exons at CTCF peaks located at non-promoter sides of P2N interactions. CTCF peaks shuffled within all non-promoter 10-kb bins of P2N interactions were used as control. Dashed line shows a fold-change of 1 (n = 12). Two-tailed one-sample t-test comparing to mu = 1 (n = 12).
For all boxplots: Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (first quartile – 1.5× (third quartile – first quartile)) to (third quartile + 1.5 × (third quartile – first quartile)).
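The whisker rule stated in these captions (the minimum and maximum values lying within 1.5 × the interquartile range beyond the quartiles) can be sketched as below. The quartile interpolation method is an assumption, since plotting libraries differ and the captions do not specify one; the function name and example values are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Median, quartiles, whisker ends and outliers under the 1.5 x IQR rule.

    Assumption: quartiles use linear interpolation ("inclusive" method);
    the captions do not say which quartile definition was used.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Illustrative values only; the single extreme point falls outside the fences.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Note that the whiskers stop at observed data points, not at the fence values themselves, which matches the captions' wording of "minimum and maximum value within the range".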
Extended Data Fig. 4. Tissue-specific chromatin interactions are associated with tissue-specific gene expression.
a, b. Heat maps showing the Pearson correlation coefficients and hierarchical clustering of H3K27ac ChIP-seq (a) and normalized contact frequencies from H3K4me3 PLAC-seq (b), with color bars representing different tissues. c–e. Scatterplots between PLAC-seq normalized contact frequencies and relative gene expression (percent of total FPKM) (c), relative H3K27ac (percent of total log2CPM) (d) and relative H3K4me3 (percent of total log2CPM) (e) for interactions in Fig. 2b. f. Similar to Fig. 2c, scatterplot between the difference of the number of MAPS-identified significant chromatin interactions (x-axis) and the difference in gene expression (y-axis, merged by log2(FPKM + 1)), for all 21 pairs of 7 tissues at E12.5. The red dashed line represents the fitted linear line. All p-values are from two-sided Pearson or Spearman correlation test as indicated in figures.
Extended Data Fig. 5. Correlation between dynamic CTCF and interactions across different tissues.
a. Heat map showing the Pearson correlation coefficients and hierarchical clustering of CTCF ChIP-seq with color bars representing different tissues. b, c. Upset plots showing the numbers and the fractions of CTCF peaks shared by the different combinations of 7 tissues from embryonic day 12.5 (E12.5) (b) or 6 forebrain tissues from different time points (c). Only the top 10 combinations that contain the most CTCF peaks are shown in the plot. d. Heatmap displaying normalized contact frequencies, CTCF peak signal in the promoter and the non-promoter side of tissue-specific Promoter-to-Non-Promoter (P2N) interactions. CTCF signal is represented as individual tissue percentage of total sum across all 7 tissues. Tissue-specific interactions are clustered for being exclusive in all brain- and NT, LM, CF, or LV. e. Scatterplot between PLAC-seq normalized contact frequencies and relative CTCF (percent of total log2CPM) binding on the promoter side of tissue-specific P2N interactions in d. f. Scatterplot between PLAC-seq normalized contact frequencies and relative CTCF (percent of total log2CPM) binding on the non-promoter side of tissue-specific P2N interactions in d. All p-values are from two-sided Spearman correlation test as indicated in figures.
Extended Data Fig. 6. Dynamics of promoter-centered interactions across developmental stages in forebrain.
a. Heat map displaying normalized contact frequencies, gene expression of interacting promoters, H3K27ac distal peak signal and promoter H3K4me3 signal in Promoter-to-Non-Promoter (P2N) stage-specific interactions (left) as well as their average value across each cluster in each tissue (right). Gene expression and H3K27ac are represented as individual tissue percentage of total sum across all tissues. Stage-specific interactions are clustered for being exclusive in E12.5/E13.5, E14.5/E15.5, or E16.5/P0-only interactions. b. Scatterplot between PLAC-seq normalized contact frequencies and relative gene expression (percent of total FPKM) for P2N interactions in a. c. Scatterplot between PLAC-seq normalized contact frequencies and relative H3K27ac (percent of total log2 CPM) for P2N interactions in a. d. Scatterplot between PLAC-seq normalized contact frequencies and relative H3K4me3 (percent of total log2 CPM) for P2N interactions in a. e. Similar to Fig. 2c, scatterplot between the difference of the number of MAPS-identified significant chromatin interactions (x-axis) and the difference in gene expression (y-axis, merged by log2(FPKM + 1)), for all 15 pairs of forebrain tissue at 6 developmental stages. The red dashed line represents the fitted linear line, suggesting that the change in significant chromatin interactions is positively correlated with the change in gene expression. All P-values are from two-sided Spearman correlation test as indicated in figure.
Extended Data Fig. 7. Chromatin interactions identify enhancer target genes.
a. Schematic illustrating analysis pipeline for determining interacting cCRE-gene pairs and cEnhancer-gene pairs. b. Spline fit lines of Spearman’s correlation coefficients between cCRE H3K27ac and assigned gene mRNA as a function of cCRE to TSS distance for two different pairing strategies. Degrees of freedom = 2. Two-tailed two-sample t-test for each 200 Kb interval without multiple comparison adjustment. c. Histogram for number of target genes assigned to cEnhancers. d. Histogram for number of cEnhancers assigned to target genes. e. Density plot of cEnhancer to target TSS distances. f. Pie chart representing percentage of pairs where the cEnhancer targets the gene with the closest TSS or not. g. Distribution of H3K27ac log2 counts per million (CPM) ± 1 kb from center of elements from each group in Fig. 3c, d. P-values from two-tailed Student's t-test without multiple comparison adjustment. h. Expression and top enriched GO: Biological Process terms for genes interacting with VISTA enhancers with an identified promoter interaction in the same tissue where enhancer activity was reported positive in the VISTA database. Due to low numbers of liver positive VISTA enhancers, only one gene (Igf2) was assigned to this group and was omitted from the figure. Paired t-test, two-sided, without multiple comparison adjustment. For all boxplots in this figure: Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (first quartile – 1.5 × (third quartile – first quartile)) to (third quartile + 1.5 × (third quartile – first quartile)).
Extended Data Fig. 8. Profiling candidate forebrain enhancer-gene pairs across developmental stages.
a. Elbow plots displaying sum of squared distances for indicated number of k-means clusters. K-means clustering was performed using normalized H3K27ac counts surrounding 2 Kb of cEnhancers of cEnhancer-gene pairs present in E12.5 tissues (left) and FB developmental stages (right). b. Heat map for chromatin features and expression of forebrain cEnhancer-gene pairs. Pairs were k-means clustered by H3K27ac signal surrounding 2 Kb at center of cEnhancer. c. Top enriched GO: Biological Process terms for genes of clusters from b. d. Example gene, Neurod6, showing correlated stage-specific cEnhancer H3K27ac, gene expression, and cEnhancer-gene interactions. The cEnhancer-gene interactions in this region from FB at different stages are marked by arcs. The Neurod6 TSS region is highlighted by a yellow box and the cEnhancers are highlighted by blue boxes. e. Enriched known and de novo motifs at cEnhancers that have TF gene expression matching cEnhancer H3K27ac patterns across stages.
Extended Data Fig. 9. SCZ-associated non-coding variants are enriched at fetal-specific enhancers.
a. Proportions of cEnhancer orthologs re-captured by CREs from two human fetal cortex datasets (Song et al. and Trevino et al.). Regions of equal size to the cEnhancer ortholog, equidistant from but on the other side of the target gene, were used as control. Chi-square test, **** p < 2.2e-16. b. Proportions of cEnhancer-gene orthologs supported by the two human datasets. Pseudo-pairs linking the target genes with the control regions were used as controls. Chi-square test, **** p < 2.2e-16. c. Gene ontology analysis showing top enriched Biological Process terms of genes closest to cEnhancers harboring SCZ credible set SNPs by TSS proximity at 1D genomic distance. P-values from two-sided Fisher’s exact test. d. SCZ example genes, Ascl1 and Emx1, showing correlated tissue-specific cEnhancer H3K27ac, gene expression, and chromatin interactions at loci harboring SCZ SNPs.
Extended Data Fig. 10. Examples of interactions between SCZ example genes and loci harboring SCZ-candidate SNPs in human datasets.
IGV visualization of the human orthologs of cEnhancer-gene pairs of two SCZ example genes, Ascl1 (a) and Emx1 (b), under the hg38 coordinate system. ATAC-seq signals from all 4 types of cells in Song et al. and ATAC-seq signals from 22 cell clusters in Trevino et al. are overlaid respectively, and the corresponding ATAC-seq peaks are labeled at the bottom of each signal track.
