Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks
- PMID: 19061503
- PMCID: PMC2628906
- DOI: 10.1186/1471-2105-9-523
Abstract
Background: High throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin immunoprecipitation is used to enrich for particular DNA sequences of interest and signature sequencing is used to map the regions to the genome (ChIP-Seq). Elucidation of these sites of DNA-protein binding/modification is proving instrumental in reconstructing networks of gene regulation and chromatin remodelling that direct development, response to cellular perturbation, and neoplastic transformation.
Results: Here we present a package of algorithms and software that makes use of control input data to reduce false positives and estimate confidence in ChIP-Seq peaks. Several different methods were compared using two simulated spike-in datasets. Use of control input data and a normalized difference score were found to more than double the recovery of ChIP-Seq peaks at a 5% false discovery rate (FDR). Moreover, both a binomial p-value/q-value and an empirical FDR were found to predict the true FDR within 2-3 fold and are more reliable estimators of confidence than a global Poisson p-value. These methods were then used to reanalyze Johnson et al.'s neuron-restrictive silencer factor (NRSF) ChIP-Seq data without relying on extensive qPCR validated NRSF sites and the presence of NRSF binding motifs for setting thresholds.
Conclusion: The methods developed and tested here show considerable promise for reducing false positives and estimating confidence in ChIP-Seq data without any prior knowledge of the ChIP target. They are part of a larger open source package freely available from http://useq.sourceforge.net/.
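The confidence measures named in the Results paragraph can be illustrated with a short sketch. This is not the USeq implementation; it is a minimal Python illustration under stated assumptions. The function names are hypothetical, the normalized difference is assumed to take the form (t - c) / sqrt(t + c) for treatment and control read counts t and c in a window, the binomial null proportion p defaults to 0.5 (equal sequencing depth after scaling), Benjamini-Hochberg stands in for whatever q-value procedure the package uses, and the empirical FDR is assumed to be the ratio of null-comparison windows to ChIP windows passing a score threshold.

```python
from math import comb, sqrt


def normalized_difference(t, c):
    """Assumed score form: (t - c) / sqrt(t + c); 0.0 for an empty window."""
    total = t + c
    return (t - c) / sqrt(total) if total > 0 else 0.0


def binomial_upper_tail(t, c, p=0.5):
    """P(X >= t) for X ~ Binomial(t + c, p).

    p is the expected treatment fraction under the null (assumption: 0.5).
    """
    n = t + c
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t, n + 1))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, used here as a q-value stand-in."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def empirical_fdr(chip_scores, null_scores, threshold):
    """Null windows passing the threshold divided by ChIP windows passing it."""
    chip_hits = sum(s >= threshold for s in chip_scores)
    null_hits = sum(s >= threshold for s in null_scores)
    return min(1.0, null_hits / chip_hits) if chip_hits else 0.0


# Hypothetical (treatment, control) read counts for four windows.
windows = [(30, 5), (12, 10), (8, 9), (25, 3)]
scores = [normalized_difference(t, c) for t, c in windows]
pvals = [binomial_upper_tail(t, c) for t, c in windows]
qvals = benjamini_hochberg(pvals)
```

In this sketch, windows with strong treatment enrichment (such as 30 versus 5 reads) receive large scores and small q-values, while balanced windows do not. A null score distribution for `empirical_fdr` could come from comparing one portion of the control data against another, which is an assumption here rather than a description of the package.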