BMC Genomics. 2015;16 Suppl 5(Suppl 5):S4.
doi: 10.1186/1471-2164-16-S5-S4. Epub 2015 May 26.

Direct ChIP-Seq significance analysis improves target prediction

Mukesh Bansal et al. BMC Genomics. 2015.

Abstract

Background: Chromatin immunoprecipitation followed by sequencing of protein-bound DNA fragments (ChIP-Seq) is an effective high-throughput methodology for identifying context-specific DNA fragments that are bound by specific proteins in vivo. Despite significant progress in the bioinformatics analysis of these genome-scale data, a number of challenges remain, because technology-dependent biases, including variable target accessibility and mappability, sequence-dependent variability, and non-specific binding affinity, must be accounted for.

Results and discussion: We introduce a nonparametric method for scoring consensus regions of aligned immunoprecipitated DNA fragments when appropriate control experiments are available. Our method uses local models for null binding. These are necessary because binding prediction scores based on global models alone fail to properly account for specialized features of genomic regions and chance pull-downs of specific DNA fragments, thus disproportionately rewarding some genomic regions and decreasing prediction accuracy. We make no assumptions about the structure or amplitude of bound peaks, yet we show that our method outperforms leading methods developed using either global or local null hypothesis models for random binding. We test prediction performance by comparing analyses of ChIP-Seq, ChIP-chip, motif-based binding-site prediction, and shRNA assays, showing high reproducibility, binding-site enrichment in predicted target regions, and functional regulation of predicted targets.

Conclusions: Given appropriate controls, a direct nonparametric method for identifying transcription-factor targets from ChIP-Seq assays may lead to both higher sensitivity and higher specificity, and should be preferred to, or used in conjunction with, methods that use parametric models for null binding.

Figures

Figure 1
dIP significance estimates for bound genomic regions depend on both the IP experiment and the control. Significance is evaluated using the number of IP fragments (magnitude) after conditioning on the total number of fragments aligned to the region (amplitude) in both IP and control. (A) Minimum magnitudes needed to obtain FDR ≤ 0.1 for NANOG and SOX2 IP, as a function of amplitude. (B) The minimum magnitude (m_min) for a fixed amplitude is the magnitude necessary for achieving statistical significance at a given FDR cutoff. It is calculated as the value of m at which the percentage of cumulative regions just crosses the FDR value. Shown here are IP read counts for regions with amplitude 20 in NANOG ChIP-Seq data.
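The threshold search described in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the caption does not say how the null magnitudes are obtained, so the `null` input, the function name `min_magnitude`, and the tail-ratio FDR estimate are all assumptions made for illustration. The sketch takes the magnitudes of all regions that share one amplitude and returns the smallest m at which the estimated FDR falls to or below the cutoff.

```python
from typing import Optional, Sequence


def min_magnitude(observed: Sequence[int],
                  null: Sequence[int],
                  fdr_cutoff: float = 0.1) -> Optional[int]:
    """Smallest magnitude m whose estimated FDR is <= fdr_cutoff.

    observed: IP read counts (magnitudes) for regions of one fixed amplitude.
    null:     magnitudes expected without binding (hypothetical input; how
              these are derived is an assumption, not taken from the paper).
    The FDR at m is estimated as the fraction of null values >= m divided
    by the fraction of observed values >= m.
    """
    if len(observed) == 0 or len(null) == 0:
        return None
    for m in range(0, max(observed) + 1):
        obs_tail = sum(1 for x in observed if x >= m) / len(observed)
        null_tail = sum(1 for x in null if x >= m) / len(null)
        if obs_tail == 0:
            break  # no observed regions left at or above m
        if null_tail / obs_tail <= fdr_cutoff:
            return m
    return None  # cutoff never reached for this amplitude


# Toy magnitudes for illustration only (not data from the study).
observed_counts = [10, 11, 12, 18, 19, 20, 20]
null_counts = [9, 10, 10, 11, 11, 12, 12, 13, 10, 11]
print(min_magnitude(observed_counts, null_counts, fdr_cutoff=0.1))
```

Repeating the search for each amplitude would give a curve of minimum magnitude against amplitude, which is the kind of relationship panel (A) plots.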
Figure 2
Concordance between promoter occupancy predictions, evidence for functional regulation, and ChIP-chip predictions. (A) The count of predicted promoters bound by NOTCH1, NANOG and SOX2 from ChIP-Seq data is given as data labels, while the proportion of associated genes with evidence for functional regulation from RNAi studies is given on the y-axis. (B) The number of predicted targets for FOXA1 is given as data labels, and the y-axis reports the proportion of target predictions lost after TF silencing. (C) Common target gene predictions from ChIP-chip and ChIP-Seq, where data labels give the absolute count of targets predicted from ChIP-Seq data, and the y-axis gives the frequency with which these predictions were verified by ChIP-chip. We plot data for promoters that are predicted by dIP but not MACS, by MACS but not dIP, and by both dIP and MACS. dIP predicted more target genes, and its predictions agree better with ChIP-chip predictions. (D) Common gene target predictions from ChIP-chip and ChIP-Seq as a function of decreasing ChIP-Seq binding scores. (E) Jaccard's similarity coefficient was used to compare predicted ETS1-target promoters using 3 replicate IPs and 4 replicate IgG control assays, comparing the average similarity between predictions across IgG controls using the same IP (replicate IgG) or across IP assays with the same IgG control (replicate IP); error bars are given as S.E.M.
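The reproducibility comparison in panel (E) rests on Jaccard's similarity coefficient, the size of the intersection of two sets divided by the size of their union. A minimal Python sketch of that calculation follows, averaged over all pairs of replicate prediction sets. The function names, the placeholder promoter identifiers, and the choice to score two empty sets as 0.0 are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations
from typing import Iterable, List, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A & B| / |A | B| between two prediction sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


def mean_pairwise_jaccard(prediction_sets: List[Set[str]]) -> float:
    """Average Jaccard similarity over all pairs of replicate predictions."""
    pairs = list(combinations(prediction_sets, 2))
    if not pairs:
        raise ValueError("need at least two prediction sets")
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)


# Placeholder promoter identifiers, not genes from the study.
replicates = [
    {"promA", "promB", "promC"},
    {"promB", "promC", "promD"},
    {"promA", "promB", "promC"},
]
print(jaccard(replicates[0], replicates[1]))
print(mean_pairwise_jaccard(replicates))
```

Under this sketch, the "replicate IgG" value would be the mean over prediction sets that share an IP but differ in control, and the "replicate IP" value the mean over sets that share a control but differ in IP.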
Figure 3
Comparison of binding site enrichment in predicted target regions. Frequency of motif-predicted binding sites for NOTCH1, NANOG and SOX2 in dIP- and MACS-predicted bound regions as a function of dIP and MACS scores; bound regions are identified genome-wide and are not restricted to promoter regions.
Figure 4
Verification of promoter binding by PCR. Top predictions by both dIP and MACS were tested in two biological replicates, including common predictions, dIP-only predictions, and MACS-only predictions. As negative controls (grey) we tested predictions made uniquely by MACS with no control experiment input; error bars are given as S.E.M.
