Nat Commun. 2023 Sep 12;14(1):5615.
doi: 10.1038/s41467-023-41265-y.

Multi-feature clustering of CTCF binding creates robustness for loop extrusion blocking and Topologically Associating Domain boundaries

Li-Hsin Chang et al. Nat Commun.

Abstract

Topologically Associating Domains (TADs) separate vertebrate genomes into insulated regulatory neighborhoods that focus genome-associated processes. TADs are formed by Cohesin-mediated loop extrusion, with many TAD boundaries consisting of clustered binding sites of the CTCF insulator protein. Here we determine how this clustering of CTCF binding contributes to the blocking of loop extrusion and the insulation between TADs. We identify enrichment of three features of CTCF binding at strong TAD boundaries, consisting of strongly bound and closely spaced CTCF binding peaks, with a further enrichment of DNA-binding motifs within these peaks. Using multi-contact Nano-C analysis in cells with normal and perturbed CTCF binding, we establish that individual CTCF binding sites contribute to the blocking of loop extrusion, but in an incomplete manner. When clustered, individual CTCF binding sites thus create a stepwise insulation between neighboring TADs. Based on these results, we propose a model whereby multiple instances of temporal loop extrusion blocking create strong insulation between TADs.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Multiple features of CTCF binding are enriched around TAD boundaries.
a TAD boundary in mouse embryonic stem cells (mESCs) that appears as an extended transition zone. Top: intersection of the dashed lines indicates the TAD boundary as called from the minimum in the insulation score, whereas the solid arch demarcates the strong Hi-C signal that indicates intermingling around the boundary. In-between: insulation score (red dashed line: cut-off used for TAD calling). Bottom: zoomed-in CTCF and Rad21 ChIP-seq data showing clustered CBSs within the transition zone. Arrowheads indicate the orientation of the most significant CTCF binding motif in the CBSs. b Filtering of CTCF ChIP-seq peaks based on optimal CTCF binding motif discovery. Optimal enrichment is determined based on the maximum distance from a linearly increasing density (dotted lines). P-values refer to motif significance score as determined using the MEME-suite (see Methods section). c Three features of CBSs are enriched close to TAD boundaries in mESCs. Top: CBSs sorted on distance from their nearest CBS relative to TAD boundaries. Middle: CBSs sorted on peak values relative to TAD boundaries. Bottom: CBSs sorted on covered binding motifs relative to TAD boundaries. d Ranking of identified CBSs based on peak value (left) and the corresponding number of motifs (running mean). e Number of CTCF binding motifs within CBSs.
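The threshold selection described for panel b (the point of maximum distance from a linearly increasing density) resembles a generic knee-point search. The following is a minimal sketch of that idea only, assuming perpendicular distance to the straight line joining the first and last points of a cumulative curve. The function name `knee_index` and the toy counts are hypothetical and are not taken from the paper or its Methods.

```python
def knee_index(values):
    """Return the index of the point farthest from the straight line
    that joins the first and last points of the curve.

    Assumption: points are evenly spaced on the x-axis (x = index) and
    the distance is measured perpendicular to the reference line.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points")
    dx = n - 1                      # x-extent of the reference line
    dy = values[-1] - values[0]     # y-extent of the reference line
    norm = (dx * dx + dy * dy) ** 0.5
    best_i, best_d = 0, -1.0
    for i, y in enumerate(values):
        # Perpendicular distance from (i, y) to the reference line.
        d = abs(dy * i - dx * (y - values[0])) / norm
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical cumulative motif counts at increasingly permissive cut-offs.
cumulative = [0, 40, 70, 90, 100, 104, 107, 109, 110]
print(knee_index(cumulative))
```

Because the perpendicular distance depends on how the two axes are scaled, a real analysis would need to fix that scaling explicitly.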
Fig. 2. Multiple binding motifs within CTCF binding sites contribute to CTCF binding.
a Removal of CTCF binding motifs within a CBS reveals that multiple motifs contribute to CTCF binding. Left: position of the 6 motifs within CBS 20326, with arrowheads indicating orientation and values indicating significance score. Below, motif deletions and the qPCR target are shown. Right: relative enrichment of CTCF binding as determined by ChIP-qPCR. Data are presented as mean values +/− SEM. Rings indicate individual measurements (n = at least 2 measurements each from 2 biological replicates). b Outline of the SLIM-ChIP assay to determine nuclease protection at CTCF motifs in WT mESCs. Top: SLIM-ChIP read-out, with black circles indicating protection at the CTCF binding motif. Bottom: normalized pile-up of SLIM-ChIP read start positions around all significant binding motifs in CBSs (200 bp window). The gray area overlaps the 15 bp motif where signal is reduced due to protection by CTCF binding. c Normalized pile-up of SLIM-ChIP signal (forward strand only, Supplementary Fig. 3a) for motifs sorted on their significance (different graphs) and their presence within different groups of CBSs (color coding in legend). Categories with fewer than 100 motifs have been excluded. P-values refer to motif significance score as determined using the MEME-suite (see Methods section). d Normalized pile-up of SLIM-ChIP signal for motif pairs that overlap with a 3 bp shift (encompassing ~6% of all motifs). Left: zoomed-in pile-up for all CTCF binding motifs. Middle and right: zoomed-in pile-up of the left and right motifs in overlapping pairs. Solid arrows highlight the two peaks with strong signal that surround the 15 bp binding motif. Dashed arrows highlight the 3 bp shift of the strong peaks, creating a 12 bp protected motif. Above, the impact of alternating CTCF binding on the left or right motif is indicated, which explains the presence of non-protected bases in both motifs, thereby confirming that they can both bind CTCF. 
e Summary of multi-feature enrichment of CTCF binding at strongly insulated TAD boundaries, as compared to sites elsewhere in the genome.
Fig. 3. Development of Nano-C technology for multiplexed capture of 3C multi-contacts.
a Overview of the multi-contact Nano-C assay, which combines multiplexed capture of 3C molecules containing pre-selected viewpoints, their linear amplification and their single-molecule direct-RNA sequencing. b Stringent three-step filtering of Nano-C reads to identify bona-fide chromatin interactions. c Captured pair-wise (1-way) and multi-way (≥2-way) contacts. d Genomic distribution of multi-way Nano-C contacts. Further analysis in this study is limited to the 25% of multi-contacts that have at least two contacts in the TADs directly surrounding the viewpoints.
Fig. 4. Nano-C identifies and characterizes higher-order loops that structure TADs.
a 1D representation of multi-contacts. The Nano-C viewpoint is indicated with a black box and the identified contacts with red boxes. The gray line connects all contacts, thereby indicating the genomic interval that is spanned. b Nano-C multi-contacts in WT mESCs surrounding a central viewpoint within a large TAD with little CTCF binding. Violin plot indicates distances of up- and downstream interactions (white lines: median distances). Dashed purple lines indicate TAD boundaries. The viewpoint and Hi-C data are indicated on top. CTCF ChIP-seq data is indicated below. c Nano-C multi-contacts in 2 TADs surrounding a viewpoint that is close to a TAD boundary in WT (top, red) and Rad21-AID (bottom, blue) mESCs. Violin plots indicate distances of up- and downstream interactions of the viewpoint. The thick purple line indicates the boundary of interest and the thinner line a nearby boundary. Top: WT Hi-C data. In-between: WT CTCF and Rad21 ChIP-seq data, and Hi-C insulation score (red line: cut-off). d Shortest and longest distances within Nano-C multi-contacts in WT and Rad21-AID mESCs. Violin plots show the distance distributions for the combined 13 viewpoints (dashed white lines: median distance). Lines between violin plots connect pairs of shortest and longest distances within the same multi-contact. Number of data points and the Spearman correlation score are indicated in the top-left corner.
Fig. 5. Nano-C confirms the stepwise contribution of individual CBSs to loop extrusion blocking and TAD insulation.
a Nano-C multi-contacts for three viewpoints in 2 TADs surrounding a boundary in WT mESCs. The color of the lines that connect multi-contacts indicates if all interactions are upstream of the boundary (blue), downstream of the boundary (orange) or on both sides (gray). Violin plots indicate distances of up- and downstream interactions (white lines: median distances). Viewpoints and Hi-C data are indicated above. ChIP-seq data for CTCF and Rad21, and Hi-C insulation score (red line: cut-off) are depicted below. The thick purple line indicates the boundary of interest and the thinner line a nearby boundary. Below, a zoom-in of CTCF binding in the transition zone surrounding TAD boundary is provided. b Distribution of Nano-C multi-contacts in the surrounding TADs. Scrambled distributions of multi-contacts were obtained after randomly assigning contacts up- and downstream into multi-contacts. Significance: G-test of independence. c Distribution of Nano-C multi-contacts in the surrounding TADs for viewpoints close to three other boundaries. Above, the position of the viewpoints and CTCF binding is indicated. Significance: G-test of independence. d Scenarios of Cohesin loading (left) and extruded loops (right) to explain multi-contacts that do not cross a TAD boundary (top) and that cross a TAD boundary (bottom, involving Cohesin readthrough). Nano-C viewpoints are indicated as black boxes and multi-contacts as red boxes.
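The comparison in panels b and c (observed versus scrambled multi-contacts, tested with a G-test of independence) can be illustrated with a small sketch. It assumes that each multi-contact is reduced to a list of "up"/"down" labels relative to the boundary, that scrambling pools and reshuffles those labels while keeping multi-contact sizes, and that the test is run on a 2 x 3 table (observed and scrambled rows; upstream, downstream and both-sides columns). All function names and the toy data are hypothetical, and the paper's actual procedure may differ.

```python
import math
import random


def classify(sides):
    """Label one multi-contact by where its contacts fall."""
    kinds = set(sides)
    if kinds == {"up"}:
        return "upstream"
    if kinds == {"down"}:
        return "downstream"
    return "both"


def counts(multi_contacts):
    """Count multi-contacts per category: upstream, downstream, both."""
    labels = [classify(mc) for mc in multi_contacts]
    return [labels.count(k) for k in ("upstream", "downstream", "both")]


def scramble(multi_contacts, rng):
    """Pool all contact labels, shuffle, and refill same-sized multi-contacts."""
    pool = [side for mc in multi_contacts for side in mc]
    rng.shuffle(pool)
    out, pos = [], 0
    for mc in multi_contacts:
        out.append(pool[pos:pos + len(mc)])
        pos += len(mc)
    return out


def g_test(table):
    """G statistic and degrees of freedom for an r x c contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g += 2.0 * obs * math.log(obs / expected)
    return g, (len(rows) - 1) * (len(cols) - 1)


def chi2_sf_even_df(x, df):
    """Chi-square upper tail probability, closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Toy data: most multi-contacts stay on one side of the boundary.
observed = [["up", "up"]] * 30 + [["down", "down", "down"]] * 25 + [["up", "down"]] * 5
scrambled = scramble(observed, random.Random(0))
g, df = g_test([counts(observed), counts(scrambled)])
p_value = chi2_sf_even_df(g, df)
```

A large G value here would indicate that the observed multi-contacts respect the boundary more often than expected if contacts were assigned to multi-contacts at random.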
Fig. 6. Impact of CBS perturbations on loop extrusion blocking and TAD insulation.
a Overview of CTCF binding perturbations. Middle: CTCF binding (ChIP-seq) 100 kb up- and downstream of the TAD boundary in WT mESCs. Nano-C viewpoints (arrowheads) and the boundary (purple dashed line) are indicated as well. Above: CTCF binding (ChIP-seq) in the 5 kb surrounding the 4 CBSs directly surrounding the boundary in ΔCBS 20326 mESCs, where a single CBS is removed. Below, CTCF binding (ChIP-seq) surrounding the 4 CBSs in CTCF-AID mESCs, where CTCF is degraded after auxin treatment. Red crosses indicate absence of peaks. b Nano-C multi-contacts for a viewpoint in 2 TADs surrounding a boundary in WT, ΔCBS 20326 and CTCF-AID mESCs. The color of the lines indicates if all interactions are upstream of the boundary (blue), downstream of the boundary (orange) or on both sides (gray). Violin plots indicate distances of up- and downstream interactions of the viewpoint (white lines: median distances). The viewpoint is indicated above. The thick purple line indicates the boundary of interest and the thinner line a nearby boundary. c Distribution of Nano-C multi-contacts in the surrounding TADs in WT, ΔCBS 20326 and CTCF-AID cells (color-coding as in Fig. 6a). Violin plots indicate distances of up- and downstream interactions of the viewpoint. d Distribution of Nano-C multi-contacts in the surrounding TADs in WT, ΔCBS 20326 and CTCF-AID mESCs. Scrambled distributions of multi-contacts were obtained after randomly assigning contacts up- and downstream into multi-contacts. Significance: G-test of independence. e Setup of Oligopaint DNA-FISH analysis, with domains covered by probes indicated relative to CTCF binding. Crosses indicate the presence (black cross) or absence (red cross) of CTCF binding in ΔCBS 20326 and CTCF-AID mESCs. f Representative Oligopaint DNA-FISH images for the three mESC lines. Scale bar = 1 μm. g Difference in mean minimum distance between domain centroids for ΔCBS 20326 and CTCF-AID cells relative to WT cells. 
Shading refers to the pairs of domains that are analyzed. Significance: two-tailed Mann–Whitney test on pairwise distance distributions.
Fig. 7. A modified RCL polymer model that incorporates dynamic and clustered CTCF binding improves the simulation of TAD boundary structure and function.
a 1D model for stepwise and non-permanent blocking of Cohesin-mediated loop extrusion by clustered CBSs. Different lines describe the temporal progression of loop extrusion for four Cohesin complexes. Individual CBSs can block extruding complexes, but only in a non-permanent and possibly incomplete manner (exemplified by traffic lights). Isolated CBSs are unable to create long-term blocking of loop extrusion, resulting in weaker insulation between domains. Clustering of multiple CBSs, each inducing temporal blocking, promotes prolonged Cohesin residence at the boundary. Through the continuous influx of Cohesin, this increases the density of the Cohesin complex, thereby creating strong insulation. b A modified Randomly Cross-Linked (RCL) polymer model to simulate stepwise and non-permanent loop extrusion blocking at TAD boundaries. Top: scheme depicting the bead-spring chain with added boundary components. Blue monomers belong to TAD 1, orange monomers to TAD 2, blue/orange monomers can belong to either TAD and gray monomers belong to the gap without connectors. Connectors are placed at random positions within the same TAD (green lines) or fixed at the boundaries (red lines). Bottom: in silico Hi-C map of the 100 monomers that surround the (average) boundary. c Relative insulation score for the RCL model with different combinations of added boundary components. Numbers and color coding refer to the added boundary components from Fig. 7b. The black line represents the average insulation score at all boundaries in mESCs, as determined from reanalyzed Hi-C data. The gray line represents the original RCL model without any added boundary components. Top: models with fixed connectors at the boundary; bottom: models without fixed connectors at the boundary. d Error estimates for the insulation score in the RCL model with different combinations of added boundary components, relative to experimental Hi-C data from mESCs (smaller values represent a smaller error). 
e An in silico RCL polymer model that incorporates all three aspects of dynamic and clustered CTCF binding results in an improved simulation of TAD boundary structure and function.
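The insulation score that these captions rely on, both for boundary calling from Hi-C data and for comparing simulated maps, can be illustrated with a simplified sliding-window sketch. It assumes a square contact matrix given as nested lists, and scores each edge between adjacent bins by the mean contact frequency between the `window` bins on either side. The function names and the toy matrix are hypothetical, and the sketch omits the normalization and cut-off steps of a full insulation analysis.

```python
def insulation_scores(matrix, window):
    """Mean contact frequency across each edge between adjacent bins.

    Edge k separates bin k-1 from bin k. Low values mean few contacts
    cross the edge, i.e. strong insulation.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    n = len(matrix)
    scores = {}
    for edge in range(1, n):
        upstream = range(max(0, edge - window), edge)
        downstream = range(edge, min(n, edge + window))
        values = [matrix[a][b] for a in upstream for b in downstream]
        scores[edge] = sum(values) / len(values)
    return scores


def strongest_boundary(scores):
    """Edge with the minimum score, taken here as the boundary call."""
    return min(scores, key=scores.get)


# Toy in silico map: two domains of four bins each, with frequent
# contacts within a domain (10) and rare contacts between them (1).
toy_map = [[10 if (a < 4) == (b < 4) else 1 for b in range(8)] for a in range(8)]
scores = insulation_scores(toy_map, window=2)
print(strongest_boundary(scores))
```

In this toy map the minimum falls on the edge between the two domains, and the neighboring edges take intermediate values, which is the kind of profile that a relative insulation plot compares between model and experiment.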
