PLoS One. 2010 Nov 30;5(11):e15092.

doi: 10.1371/journal.pone.0015092.

Efficient double fragmentation ChIP-seq provides nucleotide resolution protein-DNA binding profiles

Michal Mokry¹, Pantelis Hatzis, Ewart de Bruijn, Jan Koster, Rogier Versteeg, Jurian Schuijers, Marc van de Wetering, Victor Guryev, Hans Clevers, Edwin Cuppen

PMID: 21152096
PMCID: PMC2994895
DOI: 10.1371/journal.pone.0015092

Michal Mokry et al. PLoS One. 2010.

Affiliation

¹ Cancer Genomics Center, Department of Medical Genetics, Hubrecht Institute and University Medical Center Utrecht, Utrecht, The Netherlands.

Abstract

Immunoprecipitated crosslinked protein-DNA fragments typically range in size from several hundred to several thousand base pairs, with a significant part of chromatin being much longer than the optimal length for next-generation sequencing (NGS) procedures. Because these larger fragments may be non-random and represent relevant biology that may otherwise be missed, but also because they represent a significant fraction of the immunoprecipitated material, we designed a double-fragmentation ChIP-seq procedure. After conventional crosslinking and immunoprecipitation, chromatin is de-crosslinked and sheared a second time to concentrate fragments in the optimal size range for NGS. Besides the benefits of increased chromatin yields, the procedure also eliminates a laborious size-selection step. We show that the double-fragmentation ChIP-seq approach allows for the generation of biologically relevant genome-wide protein-DNA binding profiles from sub-nanogram amounts of TCF7L2/TCF4, TBP and H3K4me3 immunoprecipitated material. Although optimized for the AB/SOLiD platform, the same approach may be applied to other platforms.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Double fragmentation ChIP-seq approach.**
A) Comparison of different shearing methods on crosslinked, de-crosslinked and native chromatin. Samples 1–3 are crosslinked chromatin sheared at the same power intensity with increasing shearing times in 60 mm tubes; sample 4 is crosslinked chromatin sheared using AFA tubes (Covaris); sample 5 is crosslinked chromatin sheared in 60 mm tubes and subsequently in AFA tubes; sample 6 is crosslinked chromatin sheared in 60 mm tubes, de-crosslinked and subsequently sheared in AFA tubes; samples 7 and 8 are native chromatin sheared using 60 mm tubes and AFA tubes, respectively. Extensive shearing of crosslinked chromatin (e.g. sample 5) still leaves a significant proportion of chromatin fragments outside the optimal range for next-generation sequencing. This fraction can be sheared to smaller fragments after de-crosslinking (sample 6), but not without it (sample 5). B) Schematic overview of the double fragmentation ChIP-seq procedure. After normal immunoprecipitation, DNA is de-crosslinked, purified and additionally sheared to concentrate all fragments in the size range that is optimal for short tag sequencers like AB/SOLiD (100–300 nt) or Illumina/Solexa (400–600 nt). C) Overlap between TCF4 ChIP-chip and ChIP-seq data. Peak sets from libraries prepared with the double shearing approach show a larger overlap with the ChIP-chip peak data.
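
The overlap comparison in panel C can be illustrated with a minimal sketch. It assumes peaks are given as `(chrom, start, end)` tuples with half-open coordinates and counts a peak as shared when it overlaps any peak in the other set by at least one base. The function name and the toy coordinates are hypothetical; the paper's own overlap criteria are not stated in this record.

```python
from collections import defaultdict


def overlap_fraction(query, reference):
    """Fraction of query peaks overlapping at least one reference peak.

    Peaks are (chrom, start, end) tuples, half-open coordinates (assumption).
    """
    if not query:
        return 0.0
    # Group reference intervals per chromosome so each query peak is only
    # compared against intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in query:
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query)


# Toy data, not taken from the paper.
chip_chip = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
chip_seq = [("chr1", 150, 250), ("chr2", 200, 300)]
shared = overlap_fraction(chip_chip, chip_seq)
```

The linear scan per chromosome keeps the sketch short; for genome-scale peak sets a sorted-interval or interval-tree lookup would be the usual choice.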

**Figure 2. Substructure of binding regions.**
A) Consensus binding motif sequence logo as identified by Cisgenome from the ChIP-seq data. B) Comparison of TCF4 binding regions reconstructed from the reads mapped to both strands (blue), the negative strand (green), the positive strand (red), all sub-fragmented using sonication, and reads derived from a library sub-fragmented with DNaseI (brown). The shape and structure of the binding region is highly similar for both strands and does not depend on the fragmentation method used. C) Distribution of sequencing tags from positive and negative strands around the consensus TCF4 binding motif. In contrast to existing protocols without additional fragmentation, the maxima of the peaks called separately from the positive and the negative strand overlap with only minor shifting.
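
One way to quantify the strand shift described in panel C is sketched below. It assumes tag positions have already been converted to offsets relative to the motif centre, and it uses the median offset per strand as a simple stand-in for the peak maximum. This is an illustrative simplification, not the peak-calling procedure used in the paper, and the function name is hypothetical.

```python
import statistics


def strand_shift(plus_offsets, minus_offsets):
    """Distance between the centres of the plus- and minus-strand tag
    distributions, in bases.

    Offsets are tag positions relative to the motif centre (assumption).
    The median is used as a robust proxy for the peak maximum.
    """
    if not plus_offsets or not minus_offsets:
        raise ValueError("both strands need at least one tag")
    return statistics.median(minus_offsets) - statistics.median(plus_offsets)


# Toy offsets, not taken from the paper.
plus = [-3, -2, -2, -2, -1]
minus = [1, 2, 2, 2, 3]
shift = strand_shift(plus, minus)
```

A value near zero corresponds to the "only minor shifting" reported for double-fragmented libraries; conventional libraries with longer fragments would be expected to give a larger value.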

**Figure 3. Characterization of TCF4 binding peaks.**
A) Number of peaks with at least one TCF4 binding motif in relation to the protein-DNA interaction strength. Peaks containing more reads (lower bin numbers) more often harbor a TCF4 binding motif than weaker peaks. B) Conservation profile of experimentally identified TCF4 binding regions as well as all genomic regions containing the TCF4 consensus binding motif as compared to random regions. Experimentally identified binding regions were found to be more conserved than computationally predicted sites.
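
The binning in panel A can be sketched as follows, assuming each peak is a `(read_count, sequence)` pair and the motif is supplied as a regular expression. The function names are hypothetical, and the motif `CAAAG` in the toy data is a placeholder for illustration only, not the consensus reported in the paper.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_fraction_by_bin(peaks, motif_regex, n_bins):
    """Fraction of peaks containing the motif, per strength bin.

    peaks: list of (read_count, sequence). Bin 0 holds the strongest peaks.
    Both strands are searched by also scanning the reverse complement.
    """
    pattern = re.compile(motif_regex, re.IGNORECASE)
    ranked = sorted(peaks, key=lambda p: p[0], reverse=True)
    fractions = []
    for i in range(n_bins):
        lo = i * len(ranked) // n_bins
        hi = (i + 1) * len(ranked) // n_bins
        chunk = ranked[lo:hi]
        if not chunk:
            fractions.append(0.0)
            continue
        with_motif = sum(
            1 for _, seq in chunk
            if pattern.search(seq) or pattern.search(reverse_complement(seq))
        )
        fractions.append(with_motif / len(chunk))
    return fractions


# Toy peaks; "CAAAG" is a placeholder motif, not the paper's consensus.
toy_peaks = [(100, "GGCAAAGTT"), (80, "AACTTTGCC"),
             (10, "GGGGGGGG"), (5, "TTCAAAGAA")]
fractions = motif_fraction_by_bin(toy_peaks, "CAAAG", n_bins=2)
```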

**Figure 4. Distribution of TCF4 binding peaks.**
The distribution of the TCF4 ChIP-seq peaks was analyzed with respect to the closest gene and compared to the distribution of random regions. The genome-wide distribution of ChIP-seq peaks is similar to that identified previously by ChIP-chip, with peaks predominantly located far from annotated transcription start sites. This is in line with the established role of TCF4 as a transcriptional enhancer. Error bars for random regions represent the standard deviation of 100 randomized datasets.
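
The comparison against random regions can be sketched as below. The sketch is deliberately simplified: it assumes a single chromosome, peaks reduced to point positions, a non-empty sorted list of TSS coordinates, and a single distance cutoff for "far from a TSS". All function names, the cutoff, and the toy numbers are hypothetical; the paper's distance categories are not given in this record.

```python
import random
import statistics
from bisect import bisect_left


def nearest_tss_distance(pos, sorted_tss):
    """Distance from pos to the closest TSS in a sorted, non-empty list."""
    i = bisect_left(sorted_tss, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(pos - t) for t in candidates)


def fraction_distal(positions, sorted_tss, cutoff):
    """Fraction of positions farther than cutoff from any TSS."""
    distal = sum(nearest_tss_distance(p, sorted_tss) > cutoff for p in positions)
    return distal / len(positions)


def random_control(n_peaks, chrom_length, sorted_tss, cutoff,
                   n_datasets=100, seed=0):
    """Mean and standard deviation of the distal fraction over randomized
    datasets, each with n_peaks uniformly placed positions."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_datasets):
        positions = [rng.randrange(chrom_length) for _ in range(n_peaks)]
        fractions.append(fraction_distal(positions, sorted_tss, cutoff))
    return statistics.mean(fractions), statistics.stdev(fractions)


# Toy data, not taken from the paper.
tss = [1000, 5000]
observed = fraction_distal([1200, 3000, 6000], tss, cutoff=500)
mean_random, sd_random = random_control(50, 10000, tss, cutoff=500, seed=1)
```

The standard deviation returned by `random_control` plays the role of the error bars for random regions; uniform placement ignores mappability and chromosome structure, which a real control would need to respect.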

**Figure 5. Distribution of binding regions with respect to Wnt regulated genes.**
A) Gene expression rank analysis. Genes positively regulated by Wnt contain more peak-forming sequencing tags within 100 kb of their transcription start sites (TSS). B) Enrichment pattern of sequencing reads around the TSS of up-, down-, and non-regulated genes. The observed pattern, with additional maxima downstream and upstream of the TSS, may be partly explained by the presence of alternative or non-annotated TSS, which is supported by the presence of CAGE tags in those regions.
