BMC Bioinformatics. 2015 Dec 1;16:400.
doi: 10.1186/s12859-015-0827-2.

PC-TraFF: identification of potentially collaborating transcription factors using pointwise mutual information

Cornelia Meckbach et al. BMC Bioinformatics. 2015.

Abstract

Background: Transcription factors (TFs) are important regulatory proteins that govern transcriptional regulation. Today, it is known that in higher organisms different TFs have to cooperate rather than acting individually in order to control complex genetic programs. The identification of these interactions is an important challenge for understanding the molecular mechanisms of regulating biological processes. In this study, we present a new method based on pointwise mutual information, PC-TraFF, which considers the genome as a document, the sequences as sentences, and TF binding sites (TFBSs) as words to identify interacting TFs in a set of sequences.

Results: To demonstrate the effectiveness of PC-TraFF, we performed a genome-wide analysis and a breast cancer-associated sequence set analysis for protein-coding and miRNA genes. Our results show that in each of these sequence sets, PC-TraFF is able to identify important interacting TF pairs, most of which are supported by previously published experimental results. Further, we made a pairwise comparison between PC-TraFF and three conventional methods. The outcome of this comparison strongly suggests that these methods focus on different important aspects of the interaction between TFs, and thus the pairwise overlap between any two of them is only marginal.

Conclusions: In this study, adopting an idea from linguistics for use in bioinformatics, we develop a new information-theoretic method, PC-TraFF, for the identification of potentially collaborating transcription factors based on the idiosyncrasy of their binding site distributions on the genome. The results of our study show that PC-TraFF can successfully identify known interacting TF pairs, and thus its currently biologically unconfirmed predictions could provide new hypotheses for further experimental validation. Additionally, the comparison of the results of PC-TraFF with the results of previous methods demonstrates that different methods with their specific scopes can supplement each other. Overall, our analyses indicate that PC-TraFF is a time-efficient method whose algorithm has tractable computational time and memory consumption. The PC-TraFF server is freely accessible at http://pctraff.bioinf.med.uni-goettingen.de/.
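The document/sentence/word analogy from the Background can be illustrated with a minimal sketch of pointwise mutual information, PMI(a, b) = log2( p(a, b) / (p(a) p(b)) ), computed over TFBS co-occurrence in sequences. This is not the PC-TraFF implementation: the function name `pmi_scores`, the per-sequence presence/absence counting, and the placeholder labels `TF_A`, `TF_B`, `TF_C` are assumptions made for illustration, and the sketch ignores any distance constraints, filtering, or significance testing the actual method may apply.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sequences):
    """Return PMI for every TFBS pair that co-occurs in at least one sequence.

    sequences: list of iterables, each holding the TFBS labels ("words")
    predicted in one sequence ("sentence"). Probabilities are estimated as
    the fraction of sequences containing a label or a pair of labels
    (an illustrative assumption, not the published estimator).
    """
    n = len(sequences)
    single = Counter()  # number of sequences containing each TFBS
    pair = Counter()    # number of sequences containing both TFBSs of a pair
    for sites in sequences:
        present = sorted(set(sites))  # presence/absence, canonical order
        single.update(present)
        pair.update(combinations(present, 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        p_ab = n_ab / n
        p_a = single[a] / n
        p_b = single[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


if __name__ == "__main__":
    # Hypothetical toy data: TF_A and TF_B always appear together.
    toy = [["TF_A", "TF_B"], ["TF_A", "TF_B"], ["TF_C"], ["TF_C"]]
    for pair_key, score in pmi_scores(toy).items():
        print(pair_key, score)
```

A positive score means the two binding sites co-occur more often than expected if they were distributed independently; a score near zero means their co-occurrence is consistent with independence.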


Figures

Fig. 1
PC-TraFF significant collaborating TFBS pairs based on promoter sequences of human RefSeq genes. Blue lines denote interactions between TFs whose importance is experimentally verified whereas red lines indicate potential interactions between transcription factors that have not been experimentally validated yet
Fig. 2
PC-TraFF significant collaborating TFBS pairs based on promoter sequences of human miRNA genes. Blue lines denote interactions between TFs whose importance is experimentally verified whereas red lines indicate potential interactions between transcription factors that have not been experimentally validated yet
Fig. 3
PC-TraFF significant collaborating TFBS pairs based on breast cancer-associated promoter sequences of human RefSeq genes. Blue lines denote interactions between TFs whose importance is experimentally verified whereas red lines indicate potential interactions between transcription factors that have not been experimentally validated yet. The binding sites V$NFKB_Q6, V$CETS1P54_01, and V$MYCMAX_B constitute three hubs in the predicted collaboration network of significant TFBS pairs. The hubs and their top three collaboration partners are given in Table 9
Fig. 4
PC-TraFF significant collaborating TFBS pairs based on breast cancer-associated promoter sequences of human miRNA genes. Blue lines denote interactions between TFs whose importance is experimentally verified whereas red lines indicate potential interactions between transcription factors that have not been experimentally validated yet
Fig. 5
Filtering procedure of the overlap filter. Overlapping TFBSs of the same type (marked with red circles) are filtered so that only the TFBS closer to the TSS is retained
Fig. 6
The problem of homotypic clusters: the TFBSs t_blue form a homotypic cluster within a certain interval on the sequence. The TFBS t_red is also included in this interval. According to our definition for constructing TFBS pairs, and following the DNA strand in the 5'-3' direction: i) we consider one t_blue-t_red pair in this interval, indicating that an individual TFBS can participate in only one count of a specified pair (shown with a black line); ii) if we consider t_blue-t_blue pairs, there are two pairs within this interval (shown with blue lines). The red (dashed) lines indicate that the remaining t_blue-t_blue and t_blue-t_red pairs are not taken into account in the calculation of the pointwise mutual information of these pairs
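The overlap filter described in the Fig. 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `Site` record, the function name `overlap_filter`, the use of the site midpoint to measure distance to the TSS, and the greedy closest-first rule for chains of overlapping sites are all hypothetical choices made here.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    name: str   # TFBS type (hypothetical label)
    start: int  # inclusive start coordinate on the sequence
    end: int    # inclusive end coordinate on the sequence


def overlap_filter(sites: List[Site], tss: int) -> List[Site]:
    """Among overlapping sites of the same type, keep the one closer to the TSS.

    Sites are visited from closest to farthest from the TSS (assumed here to be
    measured from the site midpoint); a site is dropped if it overlaps an
    already kept site of the same type. Sites of different types never
    exclude each other.
    """
    def distance(site: Site) -> float:
        return abs((site.start + site.end) / 2 - tss)

    kept: List[Site] = []
    for site in sorted(sites, key=distance):
        clashes = any(
            k.name == site.name and site.start <= k.end and k.start <= site.end
            for k in kept
        )
        if not clashes:
            kept.append(site)
    # Report survivors in 5'-3' order along the sequence.
    return sorted(kept, key=lambda s: s.start)


if __name__ == "__main__":
    # Hypothetical example: two overlapping "X" sites and one "Y" site.
    example = [Site("X", 10, 20), Site("X", 15, 25), Site("Y", 15, 25)]
    print(overlap_filter(example, tss=100))
```

In the example, the two overlapping `X` sites compete and only the one nearer to position 100 survives, while the `Y` site is untouched because it is of a different type.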
