Nat Protoc. 2017 Aug;12(8):1659-1672.
doi: 10.1038/nprot.2017.055. Epub 2017 Jul 20.

Mapping genome-wide transcription-factor binding sites using DAP-seq

Anna Bartlett et al. Nat Protoc. 2017 Aug.

Abstract

To enable low-cost, high-throughput generation of cistrome and epicistrome maps for any organism, we developed DNA affinity purification sequencing (DAP-seq), a transcription factor (TF)-binding site (TFBS) discovery assay that couples affinity-purified TFs with next-generation sequencing of a genomic DNA library. The method is fast, inexpensive, and more easily scaled than chromatin immunoprecipitation sequencing (ChIP-seq). DNA libraries are constructed using native genomic DNA from any source of interest, preserving cell- and tissue-specific chemical modifications that are known to affect TF binding (such as DNA methylation) and providing increased specificity as compared with in silico predictions based on motifs from methods such as protein-binding microarrays (PBMs) and systematic evolution of ligands by exponential enrichment (SELEX). The resulting DNA library is incubated with an affinity-tagged in vitro-expressed TF, and TF-DNA complexes are purified using magnetic separation of the affinity tag. Bound genomic DNA is eluted from the TF and sequenced using next-generation sequencing. Sequence reads are mapped to a reference genome, identifying genome-wide binding locations for each TF assayed, from which sequence motifs can then be derived. A researcher with molecular biology experience should be able to follow this protocol, processing up to 400 samples per week.
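As a rough illustration of the final step (deriving a sequence motif from bound regions), the sketch below builds a position frequency matrix from a handful of pre-aligned, equal-length sites and reads off a consensus. This is not the de novo motif discovery used in the protocol; the site sequences and the function names `position_frequency_matrix` and `consensus` are hypothetical and chosen only for this example.

```python
from collections import Counter

BASES = "ACGT"

def position_frequency_matrix(sites):
    """Count each base at each column of equal-length, pre-aligned sites."""
    if not sites:
        raise ValueError("need at least one site")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    # zip(*sites) iterates over alignment columns rather than rows.
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({base: counts.get(base, 0) for base in BASES})
    return matrix

def consensus(matrix):
    """Return the most frequent base per column (ties go to the earlier base in ACGT)."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in matrix)

# Hypothetical aligned sites, invented for illustration only.
sites = ["TGACGT", "TGACGT", "TGACGC", "TTACGT"]
pfm = position_frequency_matrix(sites)
print(consensus(pfm))
```

Real motif discovery must also find and align the sites within each peak, which this sketch assumes has already been done.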

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1. DAP-seq protocol overview
(a) An adapter-ligated DNA library is prepared by shearing native genomic DNA into ~200-bp fragments and ligating Illumina-based sequencing adapters onto the repaired ends. (b) Transcription factor (TF) ORF clones fused to the Halo affinity tag are expressed in vitro and bound to ligand-coupled beads, while non-specific proteins are washed away. (c) HaloTag-TF fusion proteins are incubated with the adapter-ligated genomic DNA library, and unbound DNA fragments are washed away. Samples are heated to release TF-bound DNA, and the recovered DNA is PCR amplified to attach indexed sequencing primers. Indexed DNA samples are subsequently combined and size selected to remove residual adapter dimers. Purified DNA libraries are sequenced using next-generation sequencing, and the resulting genome-wide binding events are analyzed. Peaks shown are maize DAP-seq peaks viewed in the Integrative Genomics Viewer.
Figure 2. DAP-seq DNA library titration experiment with Arabidopsis transcription factor TGA5 (AT5G06960)
(a) The number of mapped reads increases with input library amount. (b) The number of peaks called is highest at 50 ng of input and then decreases, likely because of increased background noise, consistent with the reduction in the fraction of reads in peaks (c). (d) A coverage histogram of the number of positions in the genome (y-axis) covered at each sequencing depth (x-axis) shows that input of 10, 50, and 100 ng leads to enrichment (i.e., high sequencing depth) at a subset of base positions. The strand shift cross-coverage score (e) and mean read pileup at peaks (f), both indicators of the extent of enrichment at specific genomic locations, are much higher for 50 and 100 ng of input. (g) The sequence motifs identified from the top 600 peaks of all experiments are very similar. Similarities between pairs of experiments, both quantitative (h – Pearson correlation of reads in peaks) and qualitative (i – Jaccard index of peak regions), show that experiments with 10, 50, and 100 ng identify similar genome-wide binding profiles. The DNA library was prepared from A. thaliana ecotype Col-0. Following the DAP-seq protocol with the TF TGA5, the resulting libraries were sequenced on an Illumina HiSeq 4000 instrument with 100-bp paired-end reads. Reads were mapped using bowtie2 version 2.2.9 against the TAIR10 reference with default parameters. Reads aligned to nuclear chromosomes with a MAPQ score ≥30 were used to call peaks with the GEM peak caller, using only the first read in each pair, with parameters “--k_min 6 --k_max 20 --k_seqs 600 --outNP --outMEME --outJASPAR --k_neg_dinu_shuffle --t 11”. The top 600 peaks, ranked first by enrichment q-value and then by fold enrichment, were used for de novo motif discovery by MEME-ChIP version 4.11.2. Fractions of reads in peaks, coverage histograms, strand shift cross-coverage scores, mean read pileup at peaks, and pairwise Pearson correlations were computed with the R package ChIPQC version 1.10.2. Motif logos were drawn with the R package motifStack version 1.16.2. Pairwise Jaccard indices were computed with BEDTools version 2.24.0.
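To make the Jaccard comparison of peak regions concrete, here is a minimal pure-Python sketch of a base-pair-level Jaccard index between two peak sets: shared bases divided by bases covered by either set. It is an illustration under stated assumptions, not the BEDTools implementation. Peaks are assumed to be half-open `(chrom, start, end)` tuples, and the function names and example coordinates are hypothetical.

```python
from collections import defaultdict

def merge_intervals(peaks):
    """Merge overlapping (chrom, start, end) half-open intervals per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlapping or touching: extend the current merged interval.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged

def total_bp(merged):
    """Total number of bases covered by a merged interval set."""
    return sum(end - start for ivs in merged.values() for start, end in ivs)

def intersection_bp(a, b):
    """Bases covered by both merged sets, via a two-pointer sweep per chromosome."""
    total = 0
    for chrom in a.keys() & b.keys():
        xs, ys = a[chrom], b[chrom]
        i = j = 0
        while i < len(xs) and j < len(ys):
            lo = max(xs[i][0], ys[j][0])
            hi = min(xs[i][1], ys[j][1])
            if lo < hi:
                total += hi - lo
            # Advance whichever interval ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return total

def jaccard(peaks_a, peaks_b):
    """Jaccard index: shared bases / bases covered by either set."""
    a, b = merge_intervals(peaks_a), merge_intervals(peaks_b)
    inter = intersection_bp(a, b)
    union = total_bp(a) + total_bp(b) - inter
    return inter / union if union else 0.0

# Hypothetical peak sets, invented for illustration only.
peaks_a = [("Chr1", 100, 200), ("Chr1", 150, 300), ("Chr2", 0, 50)]
peaks_b = [("Chr1", 250, 400)]
print(round(jaccard(peaks_a, peaks_b), 4))
```

A value near 1 means the two experiments cover nearly the same regions, and a value near 0 means little overlap. Each set is merged first so that overlapping peaks within one experiment are not counted twice.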

