Bioinform Biol Insights. 2009 Oct 21:3:129-40.

doi: 10.4137/bbi.s3445.

Inferring Transcriptional Interactions by the Optimal Integration of ChIP-chip and Knock-out Data

Haoyu Cheng¹, Lihua Jiang, Maoying Wu, Qi Liu

PMID: 20140075
PMCID: PMC2808186
DOI: 10.4137/bbi.s3445

Haoyu Cheng et al. Bioinform Biol Insights. 2009.

Affiliation

¹ School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China. Email: liuqi@sjtu.edu.cn.

Abstract

Combining heterogeneous data sources for reliable prediction of transcriptional regulation remains a challenge. Here we present a simple but powerful method to integrate chromatin immunoprecipitation (ChIP)-chip and knock-out data. Because these two data types provide complementary (physical and functional) information about transcription, a method combining them is expected to achieve high detection rates and very low false positive rates. We seek the optimal integration of the two data types using the hypergeometric distribution. We evaluate our method on yeast data and compare our predictions with YEASTRACT, high-quality ChIP-chip data, and the literature. The results show that, even with low-quality ChIP-chip data, our method uncovers more relations than were previously inferred from high-quality data. Furthermore, our method achieves a low false positive rate. We find experimental and computational evidence in the literature for most transcription factor (TF)-gene relations uncovered by our method.

Keywords: ChIP; P-value threshold; cooperativity; hypergeometric distribution; knock-out data; regulatory interaction; transcription factor.
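
The abstract states only that the hypergeometric distribution is used to integrate the two data types. As a hedged illustration, one standard way to score the overlap of two gene sets is the hypergeometric upper-tail probability. The symbols below are assumptions introduced here, not notation taken from the paper: N is the number of genes assayed in both experiments, B the set of binding targets, E the set of affected targets, and k = |B ∩ E| the observed overlap.

```latex
% Probability of seeing at least k shared genes by chance,
% when |E| genes are drawn from N genes of which |B| are binding targets.
P(X \ge k) \;=\; \sum_{i=k}^{\min(|B|,\,|E|)}
  \frac{\binom{|B|}{i}\,\binom{N-|B|}{|E|-i}}{\binom{N}{|E|}}
```

A small value of this probability means the binding and knock-out evidence agree far more often than chance would explain.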

Figures

**Figure 1.**
Schematic diagram of the method. The method starts from ChIP binding data and TF knockout data (the data sources shown on the left). For each TF, two thresholds are selected, one for the ChIP binding data and one for the TF deletion data. When the binding P value of a gene is less than the binding threshold, the gene is considered a binding target. Similarly, if the effect P value of a gene in a deletion experiment is less than its assigned threshold, the gene is defined as an affected target. Both thresholds range from 0.001 to 0.05 in increments of 0.001. A value called the overlapping significance is calculated from the binding target set, the affected target set, and their intersection (the intersecting ovals in the middle). This process is repeated for all possible combinations of thresholds so that the maximal overlapping significance is obtained (procedures and formulas are shown on the right).
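
The threshold search described in this caption can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the overlapping significance is assumed here to be the negative log10 of the hypergeometric upper-tail probability of the overlap (the caption does not give the formula), and the function names and the dictionary inputs `binding_p` and `deletion_p` (gene name to P value, for one TF) are hypothetical.

```python
from math import comb, log10


def tail_counts(n_total, n_bound, n_affected, n_overlap):
    """Return (numerator, denominator) of the hypergeometric tail P(X >= n_overlap).

    Exact integers are kept so that very small probabilities do not underflow.
    """
    upper = min(n_bound, n_affected)
    num = sum(
        comb(n_bound, k) * comb(n_total - n_bound, n_affected - k)
        for k in range(n_overlap, upper + 1)
    )
    return num, comb(n_total, n_affected)


def hypergeom_tail(n_total, n_bound, n_affected, n_overlap):
    """Probability of an overlap of at least n_overlap genes arising by chance."""
    num, denom = tail_counts(n_total, n_bound, n_affected, n_overlap)
    return num / denom


def overlap_significance(n_total, n_bound, n_affected, n_overlap):
    """ASSUMED score: -log10 of the hypergeometric tail probability."""
    num, denom = tail_counts(n_total, n_bound, n_affected, n_overlap)
    if num == 0:  # overlap impossible under the null model
        return float("inf")
    return log10(denom) - log10(num)


def best_threshold_pair(binding_p, deletion_p, thresholds=None):
    """Scan all threshold pairs and keep the one with maximal significance.

    Returns (score, binding_threshold, deletion_threshold, target_genes).
    Ties keep the most stringent pair, because the grid is scanned in
    increasing order and only strict improvements replace the best pair.
    """
    if thresholds is None:
        # 0.001 to 0.05 in increments of 0.001, as stated in the caption
        thresholds = [round(0.001 * i, 3) for i in range(1, 51)]
    genes = binding_p.keys() & deletion_p.keys()
    n_total = len(genes)
    best = (float("-inf"), None, None, set())
    for pb_t in thresholds:
        bound = {g for g in genes if binding_p[g] < pb_t}
        for pe_t in thresholds:
            affected = {g for g in genes if deletion_p[g] < pe_t}
            targets = bound & affected
            score = overlap_significance(
                n_total, len(bound), len(affected), len(targets)
            )
            if score > best[0]:
                best = (score, pb_t, pe_t, targets)
    return best
```

With a gene whose binding P value is slightly above the stringent cutoff but whose deletion P value is strongly significant, the search relaxes the binding threshold just far enough to include it, which mirrors the idea of recovering targets from lower-quality binding data. For genome-scale inputs, a library routine for the hypergeometric tail would be faster than the exact integer sum used here.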

**Figure 2.**
Comparison with YEASTRACT. For the 30 TFs, the number of target genes identified with the stringent P-value threshold pair (Pb_t = 0.001, Pe_t = 0.001) (blue), the number of target genes inferred with the optimal threshold pair (Pb*_t, Pe*_t) by our method (green), and the number of our predictions supported by YEASTRACT (red) are shown.

**Figure 3.**
Comparison with high-quality ChIP-chip data. Oval nodes represent genes identified with stringent P-value cutoffs (Pb_t = 0.001, Pe_t = 0.001), while rectangular nodes represent additional genes identified by our method using the optimal relaxed threshold pair. Nodes with a red solid border denote relations supported by YEASTRACT; all other nodes have a black dashed border. Solid nodes denote genes supported by high-quality ChIP-chip data. A) 126 identified target genes of RAP1. 56 additional target genes are identified (rectangular), of which 51 (rectangular with red solid border) are supported by YEASTRACT and 34 (solid rectangular) are supported by high-quality ChIP-chip data. B) We have identified 48 target genes of SWI4, including SWI4 itself. SWI4 self-regulation is shown by the arrow pointing back to SWI4 in the figure. Among SWI4 and the other 37 additional target genes identified by our method using the optimal relaxed threshold pair (rectangular), as many as 28 (rectangular with red solid border) and SWI4 are supported by YEASTRACT; 16 (solid rectangular) and SWI4 are supported by high-quality ChIP-chip data.

Cited by

Noise regularization removes correlation artifacts in single-cell RNA-seq data preprocessing.
Zhang R, Atwal GS, Lim WK. Patterns (N Y). 2021 Feb 15;2(3):100211. doi: 10.1016/j.patter.2021.100211. PMID: 33748795.
Single-cell transcriptomics unveils gene regulatory network plasticity.
Iacono G, Massoni-Badosa R, Heyn H. Genome Biol. 2019 Jun 4;20(1):110. doi: 10.1186/s13059-019-1713-4. PMID: 31159854.
