Nat Genet. 2023 Jun;55(6):1048-1056.
doi: 10.1038/s41588-023-01391-1. Epub 2023 May 8.

Region Capture Micro-C reveals coalescence of enhancers and promoters into nested microcompartments

Viraat Y Goel et al. Nat Genet. 2023 Jun.

Abstract

Although enhancers are central regulators of mammalian gene expression, the mechanisms underlying enhancer-promoter (E-P) interactions remain unclear. Chromosome conformation capture (3C) methods effectively capture large-scale three-dimensional (3D) genome structure but struggle to achieve the depth necessary to resolve fine-scale E-P interactions. Here, we develop Region Capture Micro-C (RCMC) by combining micrococcal nuclease (MNase)-based 3C with a tiling region-capture approach and generate the deepest 3D genome maps reported with only modest sequencing. By applying RCMC in mouse embryonic stem cells and reaching the genome-wide equivalent of ~317 billion unique contacts, RCMC reveals previously unresolvable patterns of highly nested and focal 3D interactions, which we term microcompartments. Microcompartments frequently connect enhancers and promoters, and although loss of loop extrusion and inhibition of transcription disrupts some microcompartments, most are largely unaffected. We therefore propose that many E-P interactions form through a compartmentalization mechanism, which may partially explain why acute cohesin depletion only modestly affects global gene expression.

Figures

Extended Data Figure 1. RCMC efficiently and reproducibly captures ligated dinucleosomal fragments, giving rise to deep contact maps.
(a) Representative MNase titration DNA gel indicating the ideal level of digestion by MNase, based on the ratio of fragment sizes, for the RCMC protocol. (b) Representative size selection gel for the RCMC protocol showing the dinucleosomal band that is extracted to obtain ligated fragments. (c) Overview of the capture probe design workflow for RCMC. 80-mer probes tiling the region of interest are designed, removing those which overlap highly repetitive regions. (d) Summary of the capture efficiency for each of the five regions for which probes were designed. The locations and sizes of the regions, the number of ligated fragments which mapped at single loci at both ends in total and in the region, and the capture efficiencies are given. Because different capture probe sets were used for Biological Replicates 1 (two separate sets of capture probes) and 2 (simultaneous capture for all five loci), numbers are separately provided for each Biological Replicate. (e) Contact maps comparing raw, unbalanced data (upper panel, lower triangle), ICE-balanced to all aligned reads (lower panel, lower triangle) and ICE-balanced to reads in Capture loci only (both panels, upper triangle). Balancing only to data entirely within Capture loci was necessary to remove artifacts due to capture bias. (f) Contact maps comparing the entire Fbn2 TAD in RCMC and in Hi-C³¹ and Micro-C¹³. Gene annotations and ChIP-seq signal tracks are shown below the contact maps. (g) Measurement of reproducibility between wild-type replicates across all five Capture loci, with reproducibility scores determined using HiCRep at 10 kb resolution, clustered according to similarity. Three technical RCMC replicates (denoted by “TR#”) comprise Biological Replicate 1, while “BR2” denotes Biological Replicate 2. TR3_WT is noted in blue text at the Sox2 and Nanog loci because very little TR3_WT pre-Capture library remained for input to Sox2 & Nanog Capture after the initial Ppm1g, Klf1 and Fbn2 Capture experiment; accordingly, relative to all other replicates, TR3_WT has much lower sequencing depth (0.5-2.4% the number of unique contacts) at the Sox2 & Nanog loci.
Extended Data Figure 2. Benchmarking of RCMC against other 3C methods.
(a) Contact probability curves comparing RCMC against the highest resolution Tiled-Micro-Capture-C (TMCC)¹⁷, Micro-C¹³, and Hi-C³¹ mESC datasets across contact distances. (b) Benchmarking comparison of RCMC’s ability to fill out high-resolution contact matrices against TMCC¹⁷, Micro-C¹³, and Hi-C³¹. Region-averaged calculations are shown for all methods, and calculations for individual Captured regions are also shown for RCMC and TMCC. The x-axis shows the contact distance in bp, and the y-axis shows the fraction of all bins at a given contact distance within the Captured locus that contain at least one read at 100 bp resolution. (c) Summary of read counts across RCMC, TMCC¹⁷, Micro-C¹³, and Hi-C³¹. The number of mapped sequencing reads, the fraction of unique reads, and the fraction of structurally informative (defined as cis contacts >=1 kb) unique reads are given for each method. Two versions of quantification are provided for TMCC. In black are numbers processed using the same bioinformatic pipeline as for RCMC. Capture region-specific quantifications (defined here as all reads with at least one of two read mates mapped to the locus) are also provided for all RCMC loci and the Sox2 and Nanog TMCC loci; the Oct4 and Prdm14 TMCC loci are not considered in this manuscript. In blue are numbers kindly provided by Dr. A Marieke Oudelaar, obtained using the custom TMCC-specific bioinformatic pipeline from Aljahani et al. Values with asterisks denote quantifications of all unique contact pairs mapped to Captured loci (not filtered to be >= 1 kb in size). (d) Contact map comparisons of RCMC data generated in this manuscript, starting from the full dataset (topmost) and successively downsampled by factors of two down to 1/128th of the data (bottommost), shown for the Klf1 locus at 500 bp resolution. (e) As in (b), benchmarking comparison of successively downsampled RCMC’s ability to fill out high-resolution contact matrices against Micro-C¹³ at the Klf1 locus. (f) Contact map comparisons of 1/64th and 1/128th downsampled RCMC (left) against the highest-resolution available mESC Micro-C¹³ (right; Hsieh 2020) dataset, shown for the Klf1 locus at 500 bp resolution.
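The matrix-fill metric in panels (b) and (e) and the downsampling series in panel (d) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the representation of contacts as pairs of coordinates, and the use of random thinning for downsampling are all assumptions made for illustration.

```python
import random


def fill_fraction(contacts, region_start, region_end, distance, resolution=100):
    """Fraction of bin pairs separated by `distance` bp that hold >= 1 contact.

    `contacts` is a list of (pos1, pos2) coordinates in bp (hypothetical input
    format). Only contacts with both ends inside the region are counted.
    """
    n_bins = (region_end - region_start) // resolution
    offset = distance // resolution          # diagonal index of the matrix
    n_pairs = n_bins - offset                # bin pairs on that diagonal
    if n_pairs <= 0:
        return 0.0
    binned_end = region_start + n_bins * resolution
    filled = set()
    for pos1, pos2 in contacts:
        lo, hi = min(pos1, pos2), max(pos1, pos2)
        if lo < region_start or hi >= binned_end:
            continue
        b1 = (lo - region_start) // resolution
        b2 = (hi - region_start) // resolution
        if b2 - b1 == offset:
            filled.add(b1)                   # one entry per filled bin pair
    return len(filled) / n_pairs


def downsample(contacts, fraction, seed=0):
    """Keep each contact independently with probability `fraction` (assumed scheme)."""
    rng = random.Random(seed)
    return [c for c in contacts if rng.random() < fraction]


# Halving series from the full dataset down to 1/128th, as in panel (d).
fractions = [1 / 2 ** k for k in range(8)]
```

Evaluating `fill_fraction` over a range of distances for each entry of `fractions` would give curves of the kind described for panel (e), under the assumptions above.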
Extended Data Figure 3. RCMC generates deeper contact maps than other 3C methods across all 5 Captured loci.
Contact map comparisons of RCMC against the highest-resolution available mESC Hi-C³¹ (top; Bonev 2017) and Micro-C¹³ (bottom; Hsieh 2020) datasets at the Klf1, Ppm1g, Sox2, Nanog, and Fbn2 loci. Full Capture regions are shown for each locus at resolutions ranging from 1-5 kb, as well as Klf1 and Ppm1g zoom-ins at 800 and 1000 bp, respectively. Gene annotations and ATAC, ChIP-seq, and RNA-seq tracks (Supplementary Table 1) are shown below the contact maps, while the contact intensity scales are shown next to the maps.
Extended Data Figure 4. RCMC maps the Sox2 locus more deeply and efficiently than sister methods, uncovering previously unseen interactions.
(a) Contact map comparisons of RCMC against Hi-C³¹ (top) and Micro-C¹³ (bottom) at the Sox2 locus at 1.6 kb resolution. Arrows mark contacts between Sox2, the SCR, and Fxr1 not mapped by Hi-C and Micro-C. (b) Contact map comparisons of RCMC against Tiled-Micro-Capture-C¹⁷ (TMCC) across the whole TMCC-Captured locus (left, 1600 bp resolution) and in the Sox2 and SCR regulatory cluster (right, 500 bp resolution). Full datasets are visualized in the top contact maps, and TMCC has been downsampled to match the total number of RCMC sequencing reads in view in the bottom contact maps.
Extended Data Figure 5. RCMC identifies microcompartments, which are not visible in other methods and not reliably called by existing algorithms.
(a-b) Contact map comparison of RCMC (top) against Hi-C³¹ (bottom, a) and Micro-C¹³ (bottom, b) at the Klf1 locus at 500 and 250 bp resolutions and at the Ppm1g locus at 1000 and 250 bp resolutions. (c) Contact maps of the Klf1 and Ppm1g loci at 1 kb resolution with loop calls by Mustache overlaid on the bottom half of the map and compartment calls by cooltools shown below the map. (d) Contact maps of the entire Klf1 (3.2 kb resolution) and Ppm1g (5 kb resolution) Captured loci with manually called loops (see Methods) overlaid on the bottom halves of the maps.
Extended Data Figure 6. Microcompartments are not artifacts resulting from incomplete ICE balancing nor chromatin accessibility.
(a) Comparison of ICE balancing across methods and Capture loci. Distributions of the sums of ICE-balanced contact matrix rows at 250 bp resolution are shown at the Klf1, Ppm1g, Fbn2, and Sox2 loci for RCMC, Micro-C¹³, and Hi-C³¹, as well as for the subset of RCMC rows containing microcompartment anchors. A sharp unimodal peak is consistent with ICE’s baseline assumption that all contact matrix rows and columns must sum to the same value. (b) Metaplots (above) and heatmaps (below) depicting ATAC signal at microcompartment anchors (left, separated by whether anchors coincide with an ATAC peak) and at all ATAC peaks in the Klf1 and Ppm1g Capture loci (right, separated by whether peaks coincide with a microcompartment anchor). Signals are plotted in a 2 kb window centered on the anchor (left) or the ATAC peak (right). (c) RCMC contact maps at the Klf1 (left, 250 bp resolution) and Ppm1g (right, 1.6 kb resolution) indicating ATAC peaks that do not form microcompartments (left, magenta arrows) and a microcompartment anchor that does not coincide with an ATAC peak (right, cyan arrow). Black arrows (right) indicate microcompartmental loops involving the ATAC-negative microcompartment anchor. (d) Venn diagram breakdown of the overlap between all manually annotated microcompartment anchors and all ATAC peaks across the Klf1 and Ppm1g Capture loci. Of 132 annotated microcompartment anchors, 12 do not coincide with ATAC peaks (cyan) while 120 do (purple, *). Of 353 called ATAC peaks, 187 do not form microcompartment anchors (magenta) while 166 do (purple, **). The apparent discrepancy of 120 microcompartment anchors being anchored by 166 ATAC peaks is due to two close ATAC peaks occasionally anchoring a single microcompartment.
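The row-sum check in panel (a) rests on the idea that a fully balanced matrix has equal row sums. A minimal Python sketch of iterative correction on a small dense symmetric matrix follows. It is a simplified illustration, not the pipeline used in the paper: real tools work on sparse data and mask low-coverage bins, and the function name and convergence settings here are assumptions.

```python
def ice_balance(matrix, n_iter=200, tol=1e-9):
    """Simplified iterative correction of a symmetric contact matrix.

    Each round divides every entry by the product of its row and column
    factors, where a factor is the row sum relative to the mean row sum.
    Returns the balanced copy and the accumulated per-bin biases.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]           # work on a copy
    bias = [1.0] * n
    for _ in range(n_iter):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break
        mean = sum(nonzero) / len(nonzero)
        # Rows with no contacts are left untouched (factor 1.0).
        factors = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
        if max(abs(f - 1.0) for f in factors) < tol:
            break
    return m, bias


def row_sums(matrix):
    """Row sums whose distribution is inspected in a check like panel (a)."""
    return [sum(row) for row in matrix]
```

After balancing, the values returned by `row_sums` should be nearly identical, which corresponds to the sharp unimodal peak described in the legend. The counts in panel (d) are also internally consistent: 12 + 120 = 132 anchors and 187 + 166 = 353 ATAC peaks.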
Extended Data Figure 7. Categories of microcompartment anchors can be defined by their chromatin features.
Metaplots (above) and heatmaps (below) depicting ATAC, ChIP-seq, and RNA-seq (Supplementary Table 1) signal at microcompartment loop anchors for classes of microcompartment anchors as defined in Fig. 3e. Features are plotted in a 2 kb window centered on the anchor.
Extended Data Figure 8. Cohesin depletion disrupts CTCF/Cohesin loops, but generally not most microcompartmental loops.
(a) Contact maps comparing a DMSO control (above) and RAD21-depleted samples (below) are shown for the Klf1, Ppm1g, Sox2, Nanog, and Fbn2 loci at resolutions spanning 800 bp – 5 kb in F1M RAD21-mAID-BFP-V5 mESCs. Arrows mark contacts lost upon RAD21 depletion. ChIP-seq data from Hsieh et al., bioRxiv (2021) is shown below the maps before and after the IAA treatment (500 μM, 3 hours). Two versions of the Fbn2 locus are shown, with the left using logarithmic contact frequency scaling and the right using linear scaling. Loss of the Fbn2 loop is most clearly seen on linear scale. (b) Contact probability curves comparing RAD21-depleted RCMC samples against a DMSO control (top) and RAD21-depleted Micro-C samples against a DMSO control (bottom). Arrows indicate the contact frequency “bump” lost upon RAD21 depletion.
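A contact probability curve of the kind shown in panel (b) can be sketched in Python as a histogram of contact distances over logarithmically spaced bins, normalized by bin width. The legend does not state the binning or normalization used, so the function name, the log-spaced bins, and the density normalization below are assumptions.

```python
import math


def contact_probability(distances, bins_per_decade=10, min_dist=100):
    """Return [(bin_start, bin_end, density)] for cis contact distances in bp.

    Density is the fraction of contacts per bp of bin width, so the curve
    integrates to 1 over the occupied bins (an assumed normalization).
    """
    counts = {}
    for d in distances:
        if d < min_dist:
            continue
        idx = math.floor(math.log10(d / min_dist) * bins_per_decade)
        counts[idx] = counts.get(idx, 0) + 1
    total = sum(counts.values())
    curve = []
    for idx in sorted(counts):
        lo = min_dist * 10 ** (idx / bins_per_decade)
        hi = min_dist * 10 ** ((idx + 1) / bins_per_decade)
        curve.append((lo, hi, counts[idx] / (total * (hi - lo))))
    return curve
```

Plotting the densities for treated and control samples on log-log axes would show features such as the "bump" mentioned in the legend, if present in the data.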
Extended Data Figure 9. Inhibition of transcription does not significantly alter genome organization in Captured loci.
Contact maps comparing control data against 45 min (top) and 4 hr (bottom) transcriptional inhibition data (from 1 μM triptolide treatments) are shown for the Klf1, Ppm1g, Sox2, Nanog, and Fbn2 loci at resolutions spanning 800 bp – 5 kb in mESC WT cells. RNA Pol II ChIP-seq data is shown below the maps for each treatment condition.
Extended Data Figure 10. Microcompartment-like structures are also visible in ultra-deep Hi-C data.
(a-d) Contact maps of ultra-deep Hi-C data in human lymphoblastoid cells (Gu et al., bioRxiv (2021)) showing loci with structures sharing many microcompartmental features. Maps were generated using Juicebox’s web interface kindly provided by Dr. Jordan Rowley. Maps are shown at 1 kb resolution, with GM12878 gene annotations, CTCF (ENCFF364OXN) and H3K27ac (ENCFF180LKW) ChIP-seq, and RNA-seq (ENCFF604VIC) signal tracks shown below the contact maps.
Figure 1. Region Capture Micro-C captures chromosome conformation at nucleosome resolution.
(a) Overview of the Region Capture Micro-C (RCMC) protocol. Cells are chemically fixed, nuclei are digested with micrococcal nuclease (MNase), and fragments are biotinylated, proximity ligated, dinucleosomes gel extracted and purified, library prepped, PCR amplified, and region-captured to create a sequencing library. After sequencing, mapping, and normalization, the data is visualized as a contact matrix. (b) Benchmarking comparison of RCMC against the highest resolution Tiled-Micro-Capture-C (TMCC)¹⁷, Micro-C¹³, and Hi-C³¹ mESC datasets. Region-averaged calculations are shown for RCMC, TMCC, Micro-C, and Hi-C, and calculations for individual captured regions are also shown for RCMC and TMCC. The x-axis shows the fraction of all reads that uniquely map to the target region (both read mates fall within the Captured region) that are structurally informative (cis contacts >=1 kb). The y-axis shows the fraction of all contact bins separated by 10 kb that contain at least one read at 100 bp resolution.
Figure 2. RCMC generates deep contact maps, reveals previously unresolved aspects of 3D genome structure, and outperforms other 3C methods.
(a-b) Contact map comparison of RCMC against the deepest available mESC Hi-C (top; Bonev 2017³¹) and Micro-C (middle; Hsieh 2020¹³) datasets at the (a) Sox2 and (b) Klf1 regions at 500 bp resolution. Gene annotations and ATAC, ChIP-seq, and RNA-seq (see Supplementary Table 1) signal tracks are shown below the contact maps, while the contact intensity scale is shown to the right. The RCMC data shown throughout this manuscript were pooled from two biological replicates in wild-type mESCs. (c) Contact map comparison of RCMC against Tiled-Micro-Capture-C (TMCC)¹⁷ at the Nanog locus at 250 bp resolution. Full datasets are visualized in the top contact map, and TMCC has been downsampled to match the total number of RCMC sequencing reads in view in the bottom contact map.
Figure 3. RCMC identifies highly nested, focal interactions called microcompartments which frequently connect enhancers and promoters.
(a-b) Contact map visualization of RCMC data and called microcompartments at the Klf1 (a) and Ppm1g (b) locus at 500 bp (a) and 1 kb (b) resolution (left) and 250 bp resolution (zoom in, right). Manually annotated microcompartment contacts are shown below the contact map diagonal on the left, while comparisons against genome-wide Micro-C¹³ (a) and Hi-C³¹ (b) are shown on the right. (c-d) Histograms showing distributions of (c) the number of focal interactions formed by microcompartment anchors and (d) the lengths spanned by focal interactions in kb. (e) Venn diagram of microcompartment anchor categories according to chromatin features overlapped by the anchor ±1 kb. Promoters were defined as regions around annotated transcription start sites ±2 kb, enhancers as regions with overlapping peaks of H3K4me1 (ENCFF282RLA) and H3K27ac (GSE90893) in ChIP-seq data which did not overlap promoters, and CTCF/cohesin as regions with overlapping peaks of CTCF (GSE90994) and SMC1A (GSE123636) in ChIP-seq data. Other regions are those not overlapping any of these features. (f) Swarm plot of the number of focal interactions formed by individual microcompartment anchors divided according to categories in (e), including the mean (μ) and median (Med) for each distribution. Anchors fitting into more than one category were excluded. (g) Fractions of loops classified into different categories: P-P (promoter-promoter), E-P (enhancer-promoter), CTCF-CTCF (CTCF/cohesin-CTCF/cohesin), other (Other-Other interactions, or any other combinations). CTCF-CTCF interactions do not include any anchors which overlap promoter or enhancer regions.
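The anchor and loop categories in panels (e) and (g) can be expressed as a short Python sketch based on interval overlap. The function names, the interval representation, and the precedence used when an anchor carries several labels are assumptions for illustration; the legend does not specify how such ties were resolved.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def classify_anchor(anchor_pos, promoters, enhancers, ctcf_sites, pad=1000):
    """Label an anchor by features overlapping anchor_pos +/- pad (1 kb in panel e).

    `promoters`, `enhancers`, and `ctcf_sites` are lists of (start, end)
    intervals prepared upstream, e.g. promoters as TSS +/- 2 kb.
    """
    window = (anchor_pos - pad, anchor_pos + pad)
    labels = set()
    if any(overlaps(window, p) for p in promoters):
        labels.add("P")
    if any(overlaps(window, e) for e in enhancers):
        labels.add("E")
    if any(overlaps(window, c) for c in ctcf_sites):
        labels.add("CTCF")
    return labels or {"Other"}


def classify_loop(labels_a, labels_b):
    """Assign a loop to P-P, E-P, CTCF-CTCF, or other (assumed precedence)."""
    if "P" in labels_a and "P" in labels_b:
        return "P-P"
    if ("E" in labels_a and "P" in labels_b) or ("P" in labels_a and "E" in labels_b):
        return "E-P"
    # CTCF-CTCF excludes anchors that also overlap promoters or enhancers.
    if labels_a == {"CTCF"} and labels_b == {"CTCF"}:
        return "CTCF-CTCF"
    return "other"
```

Under the four categories listed for panel (g), an enhancer-enhancer loop falls into "other" in this sketch.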
Figure 4. Most microcompartments are robust to the loss of loop extrusion.
(a) Cohesin (RAD21) depletion does not strongly perturb most microcompartments. Left: Treatment paradigm for rapid depletion of RAD21 upon IAA treatment in clone F1M RAD21-mAID-BFP-V5 mESCs. Right: Contact maps comparing DMSO-treated control (above) and RAD21-depleted (below) samples are shown for the Klf1 and Ppm1g loci. (b) Western blot showing near-complete (97%) depletion of RAD21 following 3 hours of IAA treatment. This Western blot was performed once using cells collected simultaneously for RCMC. (c) Aggregate peak analysis matrix of called microcompartmental contacts after RAD21 depletion compared to their respective controls, separated by the identity of each contact’s constituent anchors. Plots show a 20 kb window centered on the loop at 250 bp resolution. The background-normalized intensity for a 1250x1250 bp box around the central dot for each aggregate peak is shown in the upper right of each plot as a quantification of aggregate dot strength. (d) Plot of individual microcompartment strengths (as quantified in (c)) in the RAD21-depleted (y-axis) and control (x-axis) conditions, shown for P-P (purple, n=418), E-P (pink, n=238), and E-E (gray, n=40) loops. Interactions changing in strength by two-fold or more are visualized as x’s, while interactions below the threshold are visualized as circles and percentages are noted. (e) Zoomed-in contact maps of microcompartment examples in (a) that strengthen (i) or weaken (ii,iii) relative to the control treatment and the background in response to RAD21 depletion.
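The aggregate peak quantification in panels (c) and (d) can be sketched in Python as follows. The legend does not define the background region, so taking the background as all window pixels outside the central box is an assumption, as are the function names. At 250 bp resolution the 1250 bp box corresponds to 5 x 5 bins.

```python
def pileup(windows):
    """Element-wise average of equally sized square windows (lists of lists)."""
    n = len(windows)
    size = len(windows[0])
    return [[sum(w[i][j] for w in windows) / n for j in range(size)]
            for i in range(size)]


def dot_strength(window, box_bins=5):
    """Mean of the central box divided by the mean of the rest of the window.

    With an even window size the box cannot be exactly centered; this sketch
    then places it one bin toward the upper left.
    """
    size = len(window)
    start = (size - box_bins) // 2
    inside = range(start, start + box_bins)
    center, background = [], []
    for i in range(size):
        for j in range(size):
            if i in inside and j in inside:
                center.append(window[i][j])
            else:
                background.append(window[i][j])
    return (sum(center) / len(center)) / (sum(background) / len(background))


def changes_two_fold(control_strength, treated_strength):
    """True if strength changes by two-fold or more in either direction."""
    ratio = treated_strength / control_strength
    return ratio >= 2.0 or ratio <= 0.5
```

Applying `dot_strength` to each loop's own window in both conditions and passing the pair to `changes_two_fold` would separate the x's from the circles in a plot like panel (d), under the stated assumptions.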
Figure 5. Most microcompartments are robust to the inhibition of transcription.
(a) Inhibition of transcription initiation with triptolide does not strongly affect most microcompartments. Left: Overview of triptolide treatment for WT mESCs (45 min or 4 hr). Right: Contact maps comparing WT control (above) and transcriptionally-inhibited (below) samples are shown for the Klf1 locus (45 min timepoint shown vs control) and the Ppm1g locus (4 hr timepoint shown vs control). RNA Pol II ChIP-seq data (RPB1) is shown below. (b) Aggregate RPB1 RNA Pol II ChIP-seq signal at genes after triptolide treatment (45 min and 4 hr) and a control (WT). The x-axis depicts all unique mouse genes normalized by length and flanked by 3 kb upstream and downstream of their TSS and TES, respectively. The first 500 bp downstream of the TSS (marked by the second x-axis tick mark) are not normalized to avoid normalizing the core promoter against variable gene body lengths. (c) Left: Contact maps comparing the transcriptional inhibition timepoints (45 min treatment above, 4 hr treatment below) are shown for the Klf1 locus. Right: Zoomed-in contact maps of microcompartments across the control and triptolide treatment timepoints that weaken (i) or strengthen (ii,iii) in response to transcriptional inhibition. (d) Plot of individual microcompartment strengths in the transcriptionally inhibited (y-axis) and control (x-axis) conditions, shown for P-P (purple, n=418), E-P (pink, n=238), and E-E (gray, n=40) loops. Interactions changing in strength by two-fold or more are visualized as x’s (percentages noted), and as circles otherwise. (e) Aggregate peak analysis matrix of called microcompartmental contacts across the two transcriptional inhibition timepoints compared to the control, separated by the identity of each contact’s constituent anchors. Plots show a 20 kb window centered on the loop at 250 bp resolution, with background-normalized dot intensities shown in the upper right of each plot. (f) Proposed model for the formation of microcompartments. 
Coalescence of multiple promoters and enhancer elements in a gene-dense region may occur through A/B-block copolymer microphase separation, resulting in variable combinations of multiway interactions being present in different cells and giving rise to tessellated focal interactions in population-averaged RCMC data.
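The gene-length normalization described for panel (b) can be illustrated with a small Python sketch that maps a genomic position onto a shared metagene axis: 3 kb flanks and the first 500 bp downstream of the TSS keep their real scale, while the rest of the gene body is rescaled to a fixed length. The function name, the plus-strand assumption, and the `body_len` value are hypothetical choices, not taken from the paper.

```python
def metagene_coordinate(pos, tss, tes, flank=3000, fixed=500, body_len=2000):
    """Map a plus-strand genomic position to a metagene x-coordinate.

    Layout of the axis (in plotting units):
      [0, flank)                        upstream flank, unscaled
      [flank, flank + fixed)            first `fixed` bp after the TSS, unscaled
      [flank + fixed, ... + body_len)   remaining gene body, rescaled
      [..., ... + flank)                downstream flank, unscaled
    Returns None for positions outside the plotted range or for genes
    not longer than `fixed` bp.
    """
    if tes - tss <= fixed:
        return None
    if tss - flank <= pos < tss:
        return float(pos - (tss - flank))
    if tss <= pos < tss + fixed:
        return float(flank + (pos - tss))
    if tss + fixed <= pos < tes:
        scaled = (pos - tss - fixed) / (tes - tss - fixed) * body_len
        return flank + fixed + scaled
    if tes <= pos < tes + flank:
        return float(flank + fixed + body_len + (pos - tes))
    return None
```

Averaging ChIP-seq signal across genes at each metagene coordinate would then give an aggregate profile in which the promoter-proximal region is not distorted by variable gene body lengths.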

References

    1. Dekker J et al. The 3D Genome as Moderator of Chromosomal Communication. Cell (2016). doi:10.1016/j.cell.2016.02.007
    1. Oudelaar AM et al. The relationship between genome structure and function. Nat. Rev. Genet 22, 154–168 (2021).
    1. Lieberman-Aiden E et al. Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science 326, 289–293 (2009).
    1. Nuebler J et al. Chromatin organization by an interplay of loop extrusion and compartmental segregation. Proc. Natl. Acad. Sci 115, E6697–E6706 (2018).
    1. Dixon JR et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

Methods References

    1. Navarro Gonzalez J et al. The UCSC Genome Browser database: 2021 update. Nucleic Acids Res. 49, D1046–D1057 (2021).
    1. Yang T et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 27, 1939–1949 (2017).
    1. Venev Sergey, Abdennur Nezar, Goloborodko Anton, Flyamer Ilya, Fudenberg Geoffrey, Nuebler Johannes, Galitsyna Aleksandra, Akgol Betul, Abraham Sameer, Kerpedjiev Peter, & M. I. open2c/cooltools: v0.4.1 (v0.4.1). Zenodo 10.5281/zenodo.5214125 (2021).
    1. Abdennur N et al. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 36, 311–316 (2020).
    1. Robinson JT et al. Juicebox.js Provides a Cloud-Based Visualization System for Hi-C Data. Cell Syst. 6, 256–258.e1 (2018).