Nat Genet. 2024 Dec;56(12):2827-2841.
doi: 10.1038/s41588-024-02000-5. Epub 2024 Nov 25.

ChIP-DIP maps binding of hundreds of proteins to DNA simultaneously and identifies diverse gene regulatory elements

Andrew A Perez et al. Nat Genet. 2024 Dec.

Abstract

Gene expression is controlled by dynamic localization of thousands of regulatory proteins to precise genomic regions. Understanding this cell type-specific process has been a longstanding goal yet remains challenging because DNA-protein mapping methods generally study one protein at a time. Here, to address this, we developed chromatin immunoprecipitation done in parallel (ChIP-DIP) to generate genome-wide maps of hundreds of diverse regulatory proteins in a single experiment. ChIP-DIP produces highly accurate maps within large pools (>160 proteins) for all classes of DNA-associated proteins, including modified histones, chromatin regulators and transcription factors and across multiple conditions simultaneously. First, we used ChIP-DIP to measure temporal chromatin dynamics in primary dendritic cells following LPS stimulation. Next, we explored quantitative combinations of histone modifications that define distinct classes of regulatory elements and characterized their functional activity in human and mouse cell lines. Overall, ChIP-DIP generates context-specific protein localization maps at consortium scale within any molecular biology laboratory and experimental system.

Conflict of interest statement

Competing interests: M.G., A.A.P., M.R.B., I.N.G. and J.K.G. are inventors of a submitted patent covering the ChIP-DIP method. The other authors declare no competing interests.

Figures

Extended Data Fig. 1 ∣. Potential sources of mixing in ChIP-DIP.
(a) Schematic of labeling strategy to generate Protein G beads coupled with a unique antibody-identifying oligonucleotide and a matched antibody. (i) Protein G beads are covalently modified with a biotin, (ii) oligonucleotides containing a 3’ biotin are conjugated to streptavidin, (iii) oligo-streptavidin complexes are mixed with biotinylated Protein G beads and (iv) Protein G beads are mixed with antibodies. This process is repeated for each unique oligonucleotide-antibody pair and then all bead-antibody conjugates are pooled together. (b) Schematic of three potential sources of dissociation of chromatin-antibody-bead-oligo conjugates that could lead to mixing during ChIP-DIP: dissociation 1) between oligo and bead, 2) between antibody and bead, or 3) between antibody and chromatin. (c) If oligos dissociate from their original beads and bind to distinct beads (oligo-bead dissociation), we would expect multiple distinct oligo types on the same bead. To quantify this, we computed the percent uniqueness of oligo types within each split-pool cluster. The cumulative distribution of the uniqueness of antibody-ID oligo types (x-axis) within individual clusters is shown. (d) If antibodies dissociate from their original bead and reassociate with a different bead (antibody-bead dissociation), we expect that chromatin would associate with empty beads present in the experiment. We show a schematic of the experimental design to test for antibody movement between beads (top) and the quantification of reads per bead assigned to true targets (CTCF) or empty beads added during experimental processing steps (bottom). (e) If proteins (and their crosslinked chromatin) dissociate and reassociate to other beads containing the same epitope-specific antibodies (antibody-chromatin dissociation), we would expect that chromatin purified independently from human and mouse lysates would mix during the procedure. We show a schematic of the human-mouse mixing experimental design to test for chromatin movement (left) and quantification of species-specific reads assigned to human or mouse beads (right).
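The percent-uniqueness measure in panel (c) can be illustrated with a minimal sketch. It assumes that uniqueness means the share of a cluster's antibody-ID oligo reads that belong to the cluster's most common oligo type; the caption does not give the formula, and the function name and toy clusters below are hypothetical.

```python
from collections import Counter

def percent_uniqueness(oligo_types):
    """Percent of oligo reads in one cluster that match its most common oligo type.

    Assumed definition (not stated in the caption): 100 * top count / total count.
    """
    if not oligo_types:
        raise ValueError("cluster has no antibody-ID oligo reads")
    counts = Counter(oligo_types)
    top_count = counts.most_common(1)[0][1]
    return 100.0 * top_count / len(oligo_types)

# Hypothetical split-pool clusters: barcode -> antibody-ID oligo types observed
clusters = {
    "bc1": ["CTCF", "CTCF", "CTCF", "CTCF"],
    "bc2": ["H3K4me3", "H3K4me3", "H3K4me3", "CTCF"],
}
uniqueness = {bc: percent_uniqueness(types) for bc, types in clusters.items()}
```

Under this assumed definition, a cluster from a bead that kept only its original oligo scores 100, and oligo-bead mixing would pull the score down.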
Extended Data Fig. 2 ∣. Mapping multiple components of the same regulator complex within a single experiment.
(a) Visualization of various components of the PRC1 (RING1B, CBX8) and PRC2 (EZH2, SUZ12, EED) complexes that were mapped within the same ChIP-DIP pool (K562 52 Antibody Pool) along a genomic region (hg38, chr4:500,000-5,500,000).
Extended Data Fig. 3 ∣. Histone modifications associated with five chromatin states.
(a) UMAP embedding of 12 histone modifications measured in K562, corresponding to five chromatin states. (b) Metaplot of signal distribution of H3K36me3, H3K79me1 and H3K79me2 across the gene body of protein coding genes in K562. (c) Correlation scatterplot of H3K9ac and H3K4me3 signals at promoter sites in mESC. (d) Enrichment heatmap of H3K9me3 and H4K20me3 at various associated (ZNF genes, LTRs, LINES) and unassociated (SINES, TSS) genomic elements in K562. H3 is shown as reference. For a-d, see Methods for details on ChIP-DIP experiments used for each analysis.
Extended Data Fig. 4 ∣. Chromatin regulators co-localizing with known histone targets.
(a) Metaplots of read coverage for three H3K4me3-associated chromatin regulators (JARID1A, RBBP5, PHF8) and H3K4me3 at four promoter groups in mESC. Promoter groups were identified using k-means clustering of CR signal. (b) Metaplot showing colocalization of multiple PRC1 and PRC2 members and their respective histone modifications at RING1B sites in K562. (c) Genome-wide correlation matrix of multiple HP1 proteins versus heterochromatin and euchromatin markers in K562. For a-c, see Methods for details on ChIP-DIP experiments used for each analysis.
Extended Data Fig. 5 ∣. Simultaneous mapping of distinct RNA polymerases and their isoforms.
(a) Bar graph showing enrichment of gene class coverage (rRNA, mRNA, snRNA or tRNA) for RNAP I, II and III in mESC. For each RNAP, the bar of its associated class (or classes) is highlighted. (b) Visualization of RNAP II phosphorylation isoforms across the NUP214 gene in K562 (left). Metaplot of signal distribution of RNAP II phosphorylation isoforms across the gene body of protein coding genes in K562 (right).
Extended Data Fig. 6 ∣. Chromatin dynamics and the relationship to gene expression following LPS stimulation in mDCs.
(a) Heatmap of change in normalized coverage per 100 kb bin for various mapped factors. For each factor, only enriched bins are shown and bins are sorted left-to-right by magnitude of change. (b) Violin plot of gene expression fold change for 6hrs vs 0hrs (left) and 24hrs vs 0hrs (right) grouped by sets of genes corresponding to sets of regions from Fig. 5c (see Methods). Shown are Mann-Whitney U test p-values. (c) Track visualization of H3K27ac at 0hrs, 6hrs and 24hrs across a genomic region (mm10, chr5:29,838,000-30,024,000) upstream of the inflammatory gene IL6 and containing regions belonging to the ‘activated’ set from Fig. 5c. (d) Heatmap of Spearman correlation coefficients between histone coverage change and gene expression change between time points. Change is defined as the ratio between the two time points. All genes were included in the correlation heatmap on the left; only genes with a fold change of >2 in gene expression were included in the correlation heatmap on the right (see Methods and Supplemental Methods).
Extended Data Fig. 7 ∣. Transcription levels of specific clusters of H3K4me3 enriched regions.
(a) Violin plot of the transcriptional levels, measured by the RNAP II occupancy, of the five major clusters of H3K4me3 regions identified in Fig. 7.
Extended Data Fig. 8 ∣. Histone acetylation marks are highly correlated genome-wide.
(a) Genome-wide Pearson correlation coefficients of 15 different histone acetylation marks in mESC. Correlations are based on coverage computed in 10 kb windows. (b) Comparison of 15 different histone acetylation marks across a genomic region (mm10, chr1:55,048,000-55,148,000) in mESC.
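As an illustration of the windowed correlation in panel (a), the following sketch bins read start positions into 10 kb windows and computes a Pearson coefficient between two coverage vectors. The read positions, helper names and single-chromosome layout are hypothetical; the authors' actual pipeline, including any normalization, is described in their Methods and may differ.

```python
import math
from collections import Counter

WINDOW = 10_000  # 10 kb windows, as stated in the caption

def window_coverage(read_starts, n_windows, window=WINDOW):
    """Count reads per fixed-size window; reads beyond n_windows are ignored."""
    counts = Counter(pos // window for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_windows)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)

# Hypothetical read start positions for two acetylation marks on one chromosome
mark_a = [500, 1_200, 9_999, 25_000, 31_000]
mark_b = [100, 800, 24_000, 39_000]
cov_a = window_coverage(mark_a, n_windows=4)
cov_b = window_coverage(mark_b, n_windows=4)
r = pearson(cov_a, cov_b)
```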
Extended Data Fig. 9 ∣. Enrichment profiles for NMF-generated combinations (C1-C5) of histone acetylation marks.
(a) RNAP II, TF and CR enrichment matrix for regions assigned to combinations (C1-C5) from NMF decomposition of highly acetylated regions using histone acetylation marks, shown in Fig. 8. (b) Heatmap of genome position enrichments relative to TSS for regions assigned to combinations. (c) Transcription factors of top 10 most significant sequence motifs for regions assigned to each combination are listed.
Extended Data Fig. 10 ∣. Profiles for high density regions of NANOG-OCT4-SOX2.
(a) Plot showing normalized region scores (x-axis) for peak regions of NANOG-OCT4-SOX2, ordered by rank (y-axis). High density regions are defined as regions past the point where the slope = 1. (b) Track visualization of NANOG-OCT4-SOX2 upstream of the gene for the pluripotency transcription factor KLF4 in mESC. A high density region is indicated with a red bar; low density regions are indicated with grey bars. (c) Visualization of NANOG-OCT4-SOX2 near the TET2 gene, a developmentally associated chromatin regulator, in mESC. A high density region internal to the gene is indicated with a red bar. (d) Coverage metaplots over low density regions (LDR) vs high density regions (HDR) for pluripotency transcription factors and other transcription-related factors. Metagenes are centered on the region and the lengths represent the approximate difference in mean lengths (500 bp for LDRs and 14,500 bp for HDRs). An additional 4 kb surrounding each region is shown. (e) Enrichment heatmap for GO terms of genes associated with HDRs or LDRs containing C4, C5 or neither C4/C5 chromatin signatures. (f) Enrichment heatmap for development-associated GO terms of genes associated with HDRs or LDRs containing C4, C5 or neither C4/C5 chromatin signatures.
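The slope = 1 cutoff in panel (a) can be sketched as follows. This is one possible reading, not the authors' stated procedure: it assumes regions are sorted by score, that both rank and score are rescaled to [0, 1], that the slope is the score change per rank step, and that the ranked curve is roughly convex. The function name and scores are hypothetical.

```python
def high_density_cutoff(scores):
    """Return the score at which the rank-ordered, rescaled curve first reaches slope >= 1."""
    ordered = sorted(scores)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two regions")
    lo, hi = ordered[0], ordered[-1]
    if hi == lo:
        raise ValueError("all scores are identical; no cutoff can be defined")
    scaled = [(s - lo) / (hi - lo) for s in ordered]  # scores rescaled to [0, 1]
    rank_step = 1.0 / (n - 1)                         # ranks rescaled to [0, 1]
    for i in range(1, n):
        slope = (scaled[i] - scaled[i - 1]) / rank_step
        if slope >= 1.0:
            return ordered[i]
    return hi  # fallback for floating-point edge cases

# Hypothetical region scores: most are low, a few are very high
region_scores = [1, 2, 3, 4, 5, 6, 7, 8, 20, 60]
cutoff = high_density_cutoff(region_scores)
high_density = [s for s in region_scores if s >= cutoff]
```

In this toy example the curve is flat across the eight low-scoring regions and turns steep at the ninth, so only the two top-scoring regions are labeled high density.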
Fig. 1 ∣. ChIP-DIP is a highly multiplexed method for mapping proteins to genomic DNA.
a, Schematic of the ChIP-DIP method. (1) Beads are coupled with an antibody and labeled with the associated oligonucleotide (oligo) tag (antibody ID). (2) Sets of antibody–bead–oligonucleotide conjugates are then mixed (antibody–bead pool) and used to perform ChIP. (3) Multiple rounds of split-and-pool barcoding are performed to identify molecules associated with each chromatin–antibody–bead–oligonucleotide conjugate. (4) DNA is sequenced, and genomic DNA and antibody (Ab)–oligonucleotide containing the same split-and-pool barcode are grouped into clusters, which are used to assign genomic DNA regions to their linked antibodies. (5) All DNA reads from all clusters corresponding to the same antibody are used to generate protein localization maps. b, Protein localization maps over a specific human genomic region (hg38, chromosome (chr)12:53,649,999–54,650,000) for four protein targets: CTCF, H3K4me3, RNAP II and H3K27me3. Left, protein localization generated by ChIP-DIP in K562 cells. Top track shows read coverage before protein assignment, and the bottom four tracks correspond to read coverage after assignment to individual proteins. Right, ChIP–seq data generated by ENCODE in K562 cells for these same four proteins are shown for the same region. To enable direct comparison of scales between datasets, we normalized the scale to coverage per million aligned reads. Scale is shown from zero to maximum coverage within each region. c, Comparison of ChIP-DIP and ChIP–seq maps over specific regions corresponding to magnified views of the larger region shown in b. The locations presented are demarcated by colored bars above the gene track in b. Scale shown is like that in b. d, Genome-wide comparison (density plots of signal correlation) between the localization of each individual protein measured by ChIP-DIP (x axis) or ChIP–seq (y axis). Points are measured genome wide across 10-kb windows (CTCF, H3K27me3) or all promoter intervals (H3K4me3, RNAP II).
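Steps (4) and (5) of the schematic can be sketched as a grouping operation: reads sharing a split-and-pool barcode form a cluster, the cluster's antibody-ID oligos name the antibody, and the cluster's genomic DNA reads are pooled per antibody. The record layout, the `min_fraction` filter for ambiguous clusters and its 0.8 default are assumptions made for illustration, not the authors' published parameters.

```python
from collections import Counter, defaultdict

def assign_reads(reads, min_fraction=0.8):
    """Pool genomic DNA reads per antibody using shared cluster barcodes.

    reads: iterable of (barcode, kind, value), where kind is "oligo"
    (value = antibody ID) or "dna" (value = genomic location).
    min_fraction: hypothetical threshold; clusters whose dominant oligo type
    makes up a smaller share of their oligo reads are discarded as ambiguous.
    """
    oligos = defaultdict(Counter)
    dna = defaultdict(list)
    for barcode, kind, value in reads:
        if kind == "oligo":
            oligos[barcode][value] += 1
        elif kind == "dna":
            dna[barcode].append(value)
        else:
            raise ValueError(f"unknown read kind: {kind!r}")

    per_antibody = defaultdict(list)
    for barcode, regions in dna.items():
        counts = oligos.get(barcode)
        if not counts:
            continue  # no antibody-ID oligo in this cluster, so it cannot be assigned
        antibody, top = counts.most_common(1)[0]
        if top / sum(counts.values()) >= min_fraction:
            per_antibody[antibody].extend(regions)
    return dict(per_antibody)

# Hypothetical sequenced reads
reads = [
    ("bcA", "oligo", "CTCF"),
    ("bcA", "dna", "chr12:53700000"),
    ("bcA", "dna", "chr12:53710000"),
    ("bcB", "oligo", "H3K4me3"),
    ("bcB", "oligo", "CTCF"),
    ("bcB", "dna", "chr1:1000"),
    ("bcC", "dna", "chr2:500"),
]
maps = assign_reads(reads)
```

Here cluster `bcB` is dropped because its two oligo types are evenly split, and `bcC` is dropped because it carries no oligo at all.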
Fig. 2 ∣. ChIP-DIP accurately maps known protein–DNA interactions across a range of multiplexed protein numbers, protein compositions and cell numbers.
a, Schematic of the experimental design to test the scalability of antibody–bead pool size and composition. b, Correlation heatmap for protein localization maps of 4 proteins (CTCF, H3K4me3, RNAP II and H3K27me3) generated using antibody pools of 5 different sizes (1, 10, 35, 50 and 52 antibodies per pool) and compositions. Correlations were calculated over the set of regions corresponding to the union of all peaks called for any of the four targets in the K562 ten-antibody experiment and were calculated using the background-corrected ChIP-DIP signal for each sample (Methods). Pool sizes are listed along the top and left axes. Replicate proteins in the same pool indicate that a different antibody was used for that protein. Some proteins were not included in every pool. c, Comparison of H3K4me3 localization over a specific genomic region (hg38, chr19:45,345,500–46,045,500) when measured within various antibody pool sizes and compositions. Scale is normalized to coverage per million aligned reads. d, Comparison of CTCF localization over a specific genomic region (hg38, chr19:40,349,999–41,050,000) when measured within a pool of 10 antibodies containing a single CTCF-targeting antibody (top) or within a pool of 52 antibodies containing 2 different CTCF-targeting antibodies (bottom). Scale is normalized to coverage per million aligned reads. e, Schematic of the experimental design to test the amount of cell input required for ChIP-DIP. k, thousand; M, million. f, Correlation heatmap for protein localization maps of four targets (CTCF, H3K4me3, RNAP II and H3K27me3) generated using various amounts of input cell lysate. Correlations were calculated over the same set of regions as b and using the background-corrected ChIP-DIP signal for each sample (Methods). Amounts of input cell lysate are listed along the top and left axes. 
g, Comparison of H3K4me3 localization over a specific genomic region (hg38, chr13:40,600,000–42,300,000) when measured using various amounts of input cell lysate. Scale is normalized to coverage per million aligned reads. h, Comparison of CTCF localization over a specific genomic region (hg38, chr12:53,664,000–53,764,000) when measured using various amounts of input cell lysate. Scale is normalized to coverage per million aligned reads.
Fig. 3 ∣. ChIP-DIP accurately maps dozens of functionally diverse histone modifications and chromatin regulators.
a, Illustration of the diverse histone modifications and chromatin regulatory proteins mapped in K562 cells or mESCs using ChIP-DIP. b,c, Visualization of multiple histone modifications across a genomic region (hg38, chr22:23,050,000–23,290,000) in K562 cells corresponding to multiple histone modifications associated with enhancers (H3K4me1, H3K4me2 and H3K27ac) (b) and active gene bodies (H3K36me3, H3K79me1 and H3K79me2) (c). d, Top, schematic of histone modifications and chromatin regulators associated with active promoters. Bottom, visualization of multiple histone modifications associated with active promoters (H3K4me3 and H3K9ac) across a genomic region (mm10, chr12:81,590,000–81,636,000) in mESCs. Hash marks indicate an intervening 29-kb region that is not shown. e, Top, schematic of histone modifications and chromatin regulators associated with Polycomb-mediated repression. Bottom, visualization of multiple histone modifications associated with Polycomb-mediated repression (H3K27me3 and H2AK119ub) across a genomic region (hg38, chr2:175,846,000–176,446,000) containing the silenced HOXD cluster in K562 cells. f, Top, schematic of histone modifications and chromatin regulators associated with constitutive heterochromatin. Bottom, visualization of multiple histone modifications associated with constitutive heterochromatin (H3K9me3 and H4K20me3) across a genomic region (hg38, chr2:46,200,000–55,700,000) in K562 cells. g, Visualization of an H3K4me3-associated eraser (JARID1A) and writer component (RBBP5) across the same genomic region as that in d. h, Visualization of PRC2 (EED) and PRC1 (RING1B) components across the same genomic region as that in e. i, Visualization of HP1β and HP1α across the same genomic region as that in f.
Fig. 4 ∣. ChIP-DIP accurately maps dozens of TFs representing diverse functional classes and all three RNAPs.
a, Top, visualization of six TFs (SP1, USF2, p53-pSer15, NRF1, NANOG, RFX1) representing three broad functional classes (constitutive, stimulus response, development–cell type specific) across a genomic region (mm10, chr11:35,000,000–75,000,000) in mESCs. Bottom, higher-resolution magnified views showing individual TF binding patterns at selected targets and motif sites. (1) p53 binding the p53 response element on the cyclin G1 gene (Ccng1) promoter. (2) NANOG binding a cluster of sites internal to the developmental gene Adam19. (3) NRF1 binding multiple copies of its motif at the Fxr2 promoter. (4) The constitutively active USF2 binding its triplicate E-box motif. b, Visualization of the TF TBP (constitutive) and REST (NRSF; cell type specific) across a genomic region (hg38, chr11:1–11,000,000) in K562 cells. Bottom, higher-resolution magnified views highlight two individual peaks of REST at motif sites near promoters of known neuronal genes CHGB and SNAP25. c, De novo generated motifs for p53 (top) in mESCs and REST (bottom) in K562 cells using binding sites identified using ChIP-DIP. d, Visualization of RNAP I at the promoter and along the gene body of rDNA (left), RNAP II at an snRNA gene (middle) and RNAP III at a cluster of tRNA genes (right) in mESCs. ITS1, internal transcript spacer 1; IGS, intergenic spacer; ETS, external transcript spacer.
Fig. 5 ∣. ChIP-DIP reveals dynamic changes in the chromatin landscape following LPS stimulation of primary mDCs.
a, Schematic of the experimental design to profile chromatin changes in primary cells following LPS stimulation. b, Visualization of H3K27ac, H3K9ac, H3K36ac and transcription levels at 0 h, 6 h and 24 h across a genomic region (mm10, chr2:129,298,000–129,420,000) containing the LPS-stimulated interleukin genes Il1a and Il1b. To enable direct comparison of time points, we normalized the scale to coverage per million aligned reads, and, for each target, scale is shown from zero to maximum coverage for all three time points. c, k-means clustered heatmap of H3K27ac coverage at individual enriched genomic regions (y axis) across time points (x axis). Three distinct sets of regions showing differential temporal patterns are labeled along the left side. Regions associated with example inflammatory genes are labeled on the right side. d, Line plots of relative H3K27ac coverage of regions from c (left) and expression of associated genes (right) versus time. Subsets of enhancer regions that are newly acetylated after stimulus (‘activated’) are shown above the dashed line, and subsets of enhancer regions that are deacetylated after stimulus (‘repressed’) are shown below the dashed line (Supplementary Methods). Mean levels are shown as solid lines with surrounding 95% confidence interval bands. e, Visualization of H3K27ac, H3K79me1 and transcription levels at 0 h, 6 h and 24 h across a genomic region (mm10, chr9:25,440,000–25,640,000) containing regions belonging to the ‘repressed’ set from c. A masked region of ~6 kb within the gene has been removed and is indicated by hash marks. f, Visualization of H3K27ac, H3K79me1 and transcription levels at 0 h, 6 h and 24 h across a genomic region (mm10, chr5:92,320,000–92,380,000) containing regions belonging to the ‘activated’ set from c. For e and f, scale per histone target is shown from zero to maximum coverage across both regions; scale for transcription is shown from zero to maximum coverage across a single region. 
Schematics showing relative quantification of levels across a region are shown on the right of each track.
Fig. 6 ∣. Distinct chromatin signatures define the promoters of each RNAP.
a, Comparison of H3K4me3 and H3K27ac profiles at the promoters of RNAP I, II and III genes. The profile over RNAP I genes is displayed over the rDNA spacer promoter (left), while profiles over RNAP II and III genes are displayed as metaplots across active (blue) and inactive (dashed gray) promoters. Expr., expressed. b, Visualization of RNAP II and RNAP III along with the shared TF TBP and histone modifications H3K4me3 and H3K56ac across a genomic region (mm10, chr13:23,385,000–23,595,000) containing a tRNA gene cluster (RNAP III-transcribed genes) adjacent to a histone gene cluster (RNAP II-transcribed genes), separated by a dashed line. c, Density distribution of H3K4me2/H3K4me3 versus H3K56ac/H3K4me3 ratios at RNAP I, active RNAP II and active RNAP III promoters. Points show ratios when computed using the total sum of histone coverage over all respective promoters. Marginal distributions are shown for RNAP II and III along x and y axes. Axes are log10 scaled. This plot compares the relative signals of the same antibodies within the same sample across distinct genomic regions corresponding to known promoters of RNAP I, II and III genes. d, Schematic showing relative levels of histone modifications H3K4me2 and H3K56ac at H3K4me3-enriched regions and the relative position of the associated RNAP promoter.
Fig. 7 ∣. Combinations of histone modifications distinguish RNAP II promoter type, activity and potential.
a, Hierarchically clustered heatmap of coverage levels of ten different histone modifications (y axis) at individual H3K4me3-enriched genomic regions (x axis). Five distinct clusters of regions are indicated by colored bars along the top axis. b, RNAP II coverage at H3K4me3-enriched regions, as sorted in a. c, Gene density of ten different gene classes at H3K4me3-enriched regions, as sorted in a. eRNA, enhancer RNA; lincRNA, long intergenic noncoding RNA. d, Visualization of H3K4me3 and H3K27me3–H2AK119ub (associated with cluster 1) across the EML5 gene in K562 cells. e, Visualization of H3K4me3 and H3K79me2–H3K79me3–H3K36me3 colocalization (associated with cluster 2) across the ribosomal protein gene RPL24 in K562 cells. f, Visualization of H3K4me3 and H4K20me3–H3K9me3 colocalization (associated with cluster 3) across neighboring ZNF genes ZNF69 and ZNF700 in K562 cells. g, Visualization of H3K4me3 and H3K4me1–H3K4me2–H3K27ac (associated with cluster 4) across the long intergenic noncoding RNA gene LNCRNA0881. For tracks in d–g, the non-H3K4me3 tracks represent the sum of histone tracks associated with each set and are scaled to the maximum value across all panels. H3K4me3 tracks are scaled to the maximum for each panel. h, Schematic summarizing the co-occurring histone modifications at H3K4me3-enriched regions and their associated gene groups.
Fig. 8 ∣. Distinct combinations of histone acetylation marks define unique enhancer types that differ in their activity and developmental potential.
a, The relative weights of five different combinations of histone acetylation marks (C1–C5, y axis) for each acetylated genomic region (x axis). Regions are grouped according to the combination that received the greatest weight, and groups are indicated along the top axis. b, The relative weights of each histone acetylation mark (y axis) within each combination (x axis). Only weights greater than 2.5 are labeled. c, Visualization of H3K9ac and H4ac along with SP1 and p53 across a genomic region (mm10, chr15:34,065,000–34,086,000) containing enhancers assigned to the C1 (yellow) and C3 (red) states. d, Visualization of H2BK20ac and H3K27ac along with NANOG, TEAD1 and RNAP II across two genomic regions (left, mm10, chr7:3,191,500–3,221,500; right, mm10, chr18:5,006,500–5,016,500) containing enhancers assigned to C4 (left) and to C5 (right), respectively (the scale of the NANOG track is capped to the maximum of the left region; TEAD1 data are from published ChIP–seq data from fetal cardiomyocytes). e, Visualization of H3K9ac, H2AZac and H4ac along with RING1B, p53 and RNAP II over a genomic region (mm10, chr8:47,272,800–47,427,000) containing multiple isoforms of the gene STOX2 and enhancers assigned to states C1–C4. f, DNA-associated proteins (x axis, ordered by function) with significant binding at genomic regions defined by each combination (y axis) are indicated in color (Methods). g, Bars show the enrichment value of selected transcription-associated factors or regions with a high density of pluripotency TFs (Supplementary Methods) in C4- versus C5-associated regions. Whiskers indicate the 5th and 95th percentiles from permutation-based resampling (n = 200 permutations) in which each permutation retained three-quarters of the C4 or C5 regions. h, Schematic of C1–C5-associated regions and their corresponding functions.
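The whiskers in panel g can be illustrated with a small resampling sketch: 200 draws that each keep three-quarters of the regions, followed by the 5th and 95th percentiles of the recomputed statistic. The statistic used here (fraction of regions bound by a factor), the nearest-rank percentile rule, the fixed seed and the toy data are assumptions; the enrichment value the authors actually computed is defined in their Supplementary Methods.

```python
import random

def resampled_percentiles(values, statistic, n_perm=200, keep=0.75, seed=0):
    """Return (5th, 95th) percentiles of `statistic` over random subsamples.

    Each of the n_perm draws keeps a `keep` fraction of `values`, sampled
    without replacement. Percentiles use a simple nearest-rank rule.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = max(1, round(keep * len(values)))
    stats = sorted(statistic(rng.sample(values, k)) for _ in range(n_perm))

    def percentile(p):
        idx = int(round(p / 100 * (len(stats) - 1)))
        return stats[idx]

    return percentile(5), percentile(95)

def fraction_bound(flags):
    """Hypothetical statistic: share of regions bound by the factor."""
    return sum(flags) / len(flags)

# Hypothetical C4 regions: 1 = bound by the factor, 0 = not bound
c4_regions = [1] * 30 + [0] * 10
low, high = resampled_percentiles(c4_regions, fraction_bound)
```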
