Bioinformatics. 2009 Jul 15;25(14):1715-21.
doi: 10.1093/bioinformatics/btp312. Epub 2009 May 14.

Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data

Hyungwon Choi et al. Bioinformatics.

Abstract

Motivation: Chromatin immunoprecipitation (ChIP) followed by array hybridization, or ChIP-chip, is a powerful and widely used approach for identifying transcription factor binding sites (TFBS). Recently, massively parallel sequencing coupled with ChIP (ChIP-seq) has been increasingly used as an alternative to ChIP-chip, offering cost-effective genome-wide coverage and resolution up to a single base pair. For many well-studied TFs, both ChIP-seq and ChIP-chip experiments have been performed and their data are publicly available. Previous analyses have revealed substantial technology-specific binding signals despite strong correlation between the two sets of results. It is therefore of interest to see whether the two data sources can be combined to enhance the detection of TFBS.

Results: In this work, a hierarchical hidden Markov model (HHMM) is proposed for combining data from ChIP-seq and ChIP-chip. In the HHMM, inference results from the individual HMMs fitted to the ChIP-seq and ChIP-chip experiments are summarized by a higher-level HMM. Simulation studies show the advantage of the HHMM when data from both technologies are available. Analysis of two well-studied TFs, NRSF and CCCTC-binding factor (CTCF), also suggests that the HHMM yields improved TFBS identification in comparison to analyses using individual data sources or a simple merger of the two.

Availability: Source code for the software ChIPmeta is freely available for download at http://www.umich.edu/~hwchoi/HHMMsoftware.zip. The software is implemented in C and supported on Linux.

Figures

Fig. 1.
HHMM framework with the master process in the top layer and the multiple individual processes in the bottom layer. The hidden states in the ChIP-seq and ChIP-chip data are treated as emissions from the master process.
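
The two-layer structure in Fig. 1 can be illustrated with a small sketch. This is not the ChIPmeta implementation: it assumes the lower-level HMMs have already been decoded into 0/1 state paths (hypothetical inputs `seq_states` and `chip_states`), that the master process has two states (unbound, bound), and that the two lower-level states are conditionally independent given the master state. The function name and all parameter values (`stay`, `p_seq`, `p_chip`, `prior`) are made up for illustration. The sketch runs a scaled forward-backward pass on the master chain and returns the posterior probability of the bound state at each position.

```python
def master_posterior(seq_states, chip_states, stay=0.9,
                     p_seq=(0.1, 0.8), p_chip=(0.1, 0.8), prior=(0.5, 0.5)):
    """Posterior P(master = bound) per position, given two lower-level state paths.

    seq_states, chip_states: equal-length lists of 0/1 states (assumed to be
        decoded from separate ChIP-seq and ChIP-chip HMMs).
    p_seq[m], p_chip[m]: assumed probability that the lower-level state is 1
        when the master state is m (0 = unbound, 1 = bound).
    stay: assumed probability that the master state does not change.
    """
    n = len(seq_states)
    if n != len(chip_states):
        raise ValueError("state paths must have equal length")
    if n == 0:
        return []
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]

    def emit(m, t):
        # Lower-level states act as emissions of the master state m.
        a = p_seq[m] if seq_states[t] else 1.0 - p_seq[m]
        b = p_chip[m] if chip_states[t] else 1.0 - p_chip[m]
        return a * b

    # Forward pass, normalized at each step to avoid underflow.
    prev = [prior[m] * emit(m, 0) for m in (0, 1)]
    total = sum(prev)
    prev = [x / total for x in prev]
    fwd = [prev]
    for t in range(1, n):
        cur = [emit(m, t) * sum(prev[k] * trans[k][m] for k in (0, 1))
               for m in (0, 1)]
        total = sum(cur)
        prev = [x / total for x in cur]
        fwd.append(prev)

    # Backward pass, also normalized at each step.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = [sum(trans[m][k] * emit(k, t + 1) * bwd[t + 1][k] for k in (0, 1))
               for m in (0, 1)]
        total = sum(nxt)
        bwd[t] = [x / total for x in nxt]

    # Combine and renormalize to get the posterior of the bound state.
    post = []
    for t in range(n):
        w0 = fwd[t][0] * bwd[t][0]
        w1 = fwd[t][1] * bwd[t][1]
        post.append(w1 / (w0 + w1))
    return post


if __name__ == "__main__":
    seq = [0, 0, 1, 1, 1, 0, 0]
    chip = [0, 0, 0, 1, 1, 0, 0]
    print([round(p, 3) for p in master_posterior(seq, chip)])
```

Positions where the two technologies disagree receive an intermediate posterior that is pulled toward the neighbouring positions by the master transition matrix, which is the intuition behind summarizing both data sources with a higher-level chain rather than intersecting or uniting the calls directly.
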
Fig. 2.
ROC curves for the four simulation datasets, comparing ChIP-seq only, ChIP-chip only, HHMM, Intersection and Union. Four different settings of ChIP-seq and ChIP-chip data were generated, with signal present in 75% and 90% (A); 60% and 80% (B); 75% and 75% (C); and 90% and 90% (D) of the ChIP-enriched regions detected by ChIP-seq and ChIP-chip, respectively.
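
To make the Intersection and Union baselines in Fig. 2 concrete, here is a minimal sketch of how two sets of per-region calls could be merged and scored against a known truth. The helper names (`merge_calls`, `tpr_fpr`) and the toy data are assumptions for illustration, not the evaluation code or data used in the paper. Each merged call set gives one operating point (true positive rate, false positive rate); a full ROC curve would be traced by varying the calling threshold.

```python
def merge_calls(seq_calls, chip_calls):
    """Return (intersection, union) of two equal-length lists of 0/1 calls."""
    if len(seq_calls) != len(chip_calls):
        raise ValueError("call lists must have equal length")
    inter = [int(a and b) for a, b in zip(seq_calls, chip_calls)]
    union = [int(a or b) for a, b in zip(seq_calls, chip_calls)]
    return inter, union


def tpr_fpr(calls, truth):
    """True and false positive rates of 0/1 calls against a 0/1 truth vector.

    Assumes truth contains at least one positive and one negative region.
    """
    tp = sum(1 for c, y in zip(calls, truth) if c and y)
    fp = sum(1 for c, y in zip(calls, truth) if c and not y)
    pos = sum(truth)
    neg = len(truth) - pos
    return tp / pos, fp / neg


if __name__ == "__main__":
    # Toy example: 4 truly bound regions followed by 4 unbound regions.
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    seq = [1, 1, 1, 0, 1, 0, 0, 0]
    chip = [1, 1, 0, 1, 0, 1, 0, 0]
    inter, union = merge_calls(seq, chip)
    for name, calls in [("seq", seq), ("chip", chip),
                        ("intersection", inter), ("union", union)]:
        print(name, tpr_fpr(calls, truth))
```

In this toy example the intersection is conservative (fewer false positives, lower sensitivity) and the union is liberal (higher sensitivity, more false positives), which is the trade-off a simple merger of the two data sources cannot avoid.
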
