BMC Biol. 2024 Apr 10;22(1):78.
doi: 10.1186/s12915-024-01869-2.

Integrative analysis of transcriptomic and epigenomic data reveals distinct patterns for developmental and housekeeping gene regulation

Irina Abnizova et al. BMC Biol.

Abstract

Background: Regulation of transcription is central to the emergence of new cell types during development, and it often involves activation of genes via proximal and distal regulatory regions. The activity of regulatory elements is determined by transcription factors (TFs) and epigenetic marks, but despite extensive mapping of such patterns, the extraction of regulatory principles remains challenging.

Results: Here we study differentially expressed genes (DEGs) and similarly expressed genes (SEGs), along with their associated epigenomic profiles (chromatin accessibility and DNA methylation), during lineage specification at gastrulation in mice. Comparison of the three lineages allows us to identify genomic and epigenomic features that distinguish the two classes of genes. We show that differentially expressed genes are primarily regulated by distal elements, while similarly expressed genes are controlled by proximal housekeeping regulatory programs. Differentially expressed genes are relatively isolated within topologically associated domains (TADs), while similarly expressed genes tend to be located in gene clusters. Transcription of differentially expressed genes is associated with differentially open chromatin at distal elements including enhancers, while that of similarly expressed genes is associated with ubiquitously accessible chromatin at promoters.
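The isolation-versus-clustering contrast described above can be illustrated with a minimal sketch that computes, for each gene, the distance from its TSS to the nearest neighbouring TSS and then compares the two gene classes. This is not the authors' pipeline: the gene names, coordinates and class labels below are hypothetical toy values, and the hand-written U statistic stands in for a full Mann-Whitney test with a p-value.

```python
from statistics import median


def nearest_tss_distances(tss_by_chrom):
    """Return {gene: distance in bp to the closest other TSS on the same chromosome}."""
    distances = {}
    for genes in tss_by_chrom.values():
        ordered = sorted(genes.items(), key=lambda item: item[1])
        for i, (gene, pos) in enumerate(ordered):
            gaps = []
            if i > 0:
                gaps.append(pos - ordered[i - 1][1])  # upstream neighbour
            if i < len(ordered) - 1:
                gaps.append(ordered[i + 1][1] - pos)  # downstream neighbour
            if gaps:  # a lone gene on a chromosome has no neighbour
                distances[gene] = min(gaps)
    return distances


def mann_whitney_u(xs, ys):
    """U statistic of xs versus ys: each pair with x > y counts 1, each tie 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Hypothetical toy annotation: two isolated DEGs flanking a cluster of three SEGs.
toy_tss = {
    "chr1": {
        "deg_a": 1_000_000,
        "seg_1": 5_000_000,
        "seg_2": 5_010_000,
        "seg_3": 5_025_000,
        "deg_b": 9_000_000,
    }
}
toy_class = {"deg_a": "DEG", "deg_b": "DEG", "seg_1": "SEG", "seg_2": "SEG", "seg_3": "SEG"}

dist = nearest_tss_distances(toy_tss)
deg_dist = [d for g, d in dist.items() if toy_class[g] == "DEG"]
seg_dist = [d for g, d in dist.items() if toy_class[g] == "SEG"]

print("median DEG distance:", median(deg_dist))
print("median SEG distance:", median(seg_dist))
print("U (DEG vs SEG):", mann_whitney_u(deg_dist, seg_dist))
```

On real annotations, a library routine such as `scipy.stats.mannwhitneyu` would normally be used to obtain a p-value from the two distance lists.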

Conclusion: Based on these associations between the transcription start sites (TSSs) of developmental genes and their (linearly) distal putative enhancers, we link putative enhancers to their target promoters and infer lineage-specific repertoires of putative driver transcription factors, within which we define subgroups of pioneers and co-operators.

Keywords: Developmental and housekeeping genes; Differentially and similarly expressed genes; Epigenomics; Gene regulation programs; Pioneer TFs; Transcriptional architecture.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
DEGs and SEGs, and corresponding TADs. A Gene expression (GE) distribution for each of the three lineages: for all genes (top), DEGs (middle) and SEGs (bottom). B Violin plots showing distances in base pairs to the nearest gene (TSS of gene to TSS of neighbouring gene) in a gene set depending on GE level: all genes regardless of GE threshold (ALL GE, left), and genes with GE log(RPKM) > 2 (right). DEGs (blue, all three lineage-specific DEGs combined) are significantly further away from their neighbours than SEGs (orange; Mann-Whitney test, p < 0.01). C DEG- and SEG-only TADs do not differ in size (Mann-Whitney p = 0.79), but SEG-only TADs have a significantly higher gene density (computed and normalised in a 100-kb window) than DEG-only TADs (Mann-Whitney p = 0.021). D Upset plot showing content of TADs made up exclusively of genes expressed in just ectoderm, endoderm or mesoderm (DEGs, green), solely of similarly expressed genes (SEGs, yellow) and TADs whose content is an intersection of any two or three of the four sets. The dominance of coloured bars on the top left shows that the majority of TADs contain either DEGs or SEGs, with minimal intersections. E Hi-C interaction maps showing a typical DEG-only TAD (left) and a representative SEG-only TAD (right). The TAD on the left contains a single DEG (Foxa2), whereas the map on the right shows 16 SEGs sharing the same TAD. Genes are denoted by blue boxes, accessible chromatin by red boxes and known enhancers by green boxes. The long orange rectangle in the left plot shows the borders of the TAD, while the area under the interaction map in the right plot shows the whole SEG-containing TAD.
Fig. 2
Properties of differentially and similarly accessible chromatin regions (DARs and SARs). A Distribution of DARs (blue, left) and SARs (red, right) relative to the TSS of DEGs (blue heatmap) and SEGs (red heatmap) in 5-kb bins. Heatmaps show occurrences of DARs/SARs around each gene TSS. B Pie chart of genome-wide distribution of DARs and SARs. C Clustering of matched DARs (blue line, left heatmap) around lineage-specific H3K27ac enhancers (green heatmaps), and SARs (red line, right heatmap) around H3K27ac for comparison. D (left) Distribution of DARs (blue line) around H3K4me3 peaks (violet heatmap); (right) distribution of SARs (red line) around H3K4me3 peaks (violet heatmap). E An example of an SEG-populated TAD and the SARs within it, with SARs aligned with SEG promoters for 15 of the 16 promoters.
Fig. 3
Long-range correlation of gene expression and frequency of chromatin-accessible regions. A Low correlation of SAR frequency and average gene expression of SEG sets in an 80-kb vicinity of the SEG TSSs (R = 0.08, p > 0.05). B SAR frequency and SEG expression are not correlated across 400 kb (R < 0.15, p > 0.05). C Correlation of DAR frequency (Methods) and average gene expression of corresponding DEG gene sets in an 80-kb vicinity of the DEG TSSs (R > 0.7, p < 0.05). D Zones of ‘influence’ (positive correlation of accessibility and gene expression) for DARs and DEGs (R > 0.7, p < 0.05). E Correlation of H3K27ac frequency (Methods) and average gene expression of corresponding DEG gene sets in an 80-kb vicinity of the DEG TSSs (R > 0.7, p < 0.05). F Zones of ‘influence’ (positive correlation of accessibility and gene expression) for H3K27ac and DEGs (R > 0.7, p < 0.05), with maximal correlation around 100 kb.
Fig. 4
Differentially and similarly DNA-hypomethylated regions (DhMRs and ShMRs). A DhMRs relative to the TSSs of DEGs (left, light blue heatmap) and ShMRs relative to the TSSs of SEGs (right, red heatmap). B DhMRs are clustered around DARs (left, light blue heatmaps); DhMRs are clustered around H3K27ac (middle, green heatmaps); ShMRs are clustered around SARs (right, orange heatmaps).
Fig. 5
Inferring lineage-specific sets of driver TFs. A Upset plot for significantly enriched motifs (p < 0.001) within lineage-specific putative enhancers. Colours denote lineage-specific lists: green for ectoderm, blue for endoderm, pink for mesoderm. B TFs binding to lineage-specific enhancers: their TFBS motifs are most enriched (p < 0.001) within the lineage-specific putative enhancers of DEGs, filtered by GE > 0 of their corresponding genes. The genes are expressed in their lineage and the corresponding motif is enriched in the lineage-specific regulatory regions. C Green, blue and pink boxes within circles (DEGs) at the top contain pioneer driver TFs produced by DEGs. The orange-rimmed boxes contain lineage-specific binding TFs (correspondingly coloured background), presumably cooperative TFs; their genes are expressed in all three lineages (SEGs). Coloured ovals denote distal putative enhancers with cis-regulatory motifs for the corresponding TFs.
Fig. 6
Schematic diagram illustrating the relative isolation of DEGs compared to the clustering of SEGs: A as a linear sequence; B as two-dimensional loops and three-dimensional folding. Lineage-specific promoter-enhancer activation is indicated by colours (ectoderm: green; mesoderm: pink). Ovals (ectoderm: green; mesoderm: magenta) denote distal regulatory regions; crossed ovals are closed (chromatin-inaccessible) enhancers. Genes are coloured elongated rectangles; promoter regions / TSSs are grey squares. The grey circles with an oval at the centre represent a presumed radius of activation of a regulatory element and include the promoter region / TSS of its target gene(s). They correspond to the connecting arcs in the sequential representation (A) but demonstrate that elements far apart in the linear sequence may be close together in 2D.
