BMC Bioinformatics. 2006 Apr 21;7:218.
doi: 10.1186/1471-2105-7-218.

XcisClique: analysis of regulatory bicliques

Amrita Pati et al. BMC Bioinformatics. 2006.

Abstract

Background: Modeling of cis-elements or regulatory motifs in promoter (upstream) regions of genes is a challenging computational problem. In this work, a set of regulatory motifs simultaneously present in the promoters of a set of genes is modeled as a biclique in a suitably defined bipartite graph. A biologically meaningful co-occurrence of multiple cis-elements in a gene promoter is assessed by the combined analysis of genomic and gene expression data. Greater statistical significance is associated with a set of genes that shares a common set of regulatory motifs while simultaneously exhibiting highly correlated gene expression under given experimental conditions.

Methods: XcisClique, the system developed in this work, is a comprehensive infrastructure that associates annotated genome and gene expression data, models known cis-elements as regular expressions, identifies maximal bicliques in a bipartite gene-motif graph, and ranks bicliques based on their computed statistical significance. Significance is a function of the probability of occurrence of those motifs in a biclique (a hypergeometric distribution) and of the new sum of absolute values statistic (SAV), which uses Spearman correlations of gene expression vectors. SAV is a statistic well-suited for this purpose, as described in the discussion.
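
The two ingredients of the significance score can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the hypergeometric term is taken to be an upper-tail probability, and SAV is taken to be the sum of absolute pairwise Spearman correlations over the genes of a biclique. The function names and the toy expression vectors are hypothetical, and the paper's exact definitions may differ.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the motifs."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def sav(expression):
    """Assumed SAV: sum of |Spearman rho| over all gene pairs."""
    return sum(abs(spearman(a, b)) for a, b in combinations(expression.values(), 2))


# Hypothetical expression vectors (one value per treatment) for three genes.
expression = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
print(hypergeom_tail(5, 2, 2, 2), sav(expression))
```

Taking absolute values means that strongly anti-correlated genes contribute as much as strongly correlated ones, which is one plausible reading of "sum of absolute values".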

Results: XcisClique identifies new motif and gene combinations that might indicate as yet unidentified involvement of sets of genes in biological functions and processes. It currently supports Arabidopsis thaliana and can be adapted to other organisms, assuming the existence of annotated genomic sequences, suitable gene expression data, and identified regulatory motifs. A subset of XcisClique functionalities, including the motif visualization component MotifSee, source code, and supplementary material, is available at https://bioinformatics.cs.vt.edu/xcisclique/.

Figures

Figure 1
Graphical representation of a Biclique. The vertices of a biclique can be partitioned into two sets S and T such that no two vertices within a set are adjacent and every vertex in S is connected to every vertex in T and vice versa. In this case, S is the set of genes and T is the set of motifs.
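
The definition in this caption can be checked mechanically. The following is a minimal Python sketch, assuming the bipartite graph is stored as a set of (gene, motif) edge pairs; the function name and the toy edges are hypothetical. Because every edge joins a gene to a motif, the condition that no two vertices within a set are adjacent holds by construction, so only the all-pairs condition needs testing.

```python
def is_biclique(edges, S, T):
    """True if every gene in S is joined to every motif in T (both non-empty)."""
    return bool(S) and bool(T) and all((g, m) in edges for g in S for m in T)


# Hypothetical gene-motif edges: gene "c" lacks motif "y".
edges = {("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "x")}
print(is_biclique(edges, {"a", "b"}, {"x", "y"}))
print(is_biclique(edges, {"a", "b", "c"}, {"x", "y"}))
```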
Figure 2
XcisClique Schematic. Contents of thickly outlined boxes indicate processing. Contents of thinly outlined boxes describe data. 1. Input geneset G, 2. Input set of patterns P, 3. Input set of treatments T, 4. Find matches of P on promoters of G to get occurrence graph O, 5. Feed O to Apriori to identify bicliques, 6. Evaluate bicliques with respect to statistical over-representation of patterns to get Ftail, 7. Evaluate bicliques over T to get FSAV, 8. Combine Ftail and FSAV for each biclique and rank bicliques, 9. Visualize arrangement of cis-elements on promoters, biclique-wise, 10. Visualize gene expression vectors biclique-wise.
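
Step 4 of the schematic, matching patterns against promoters to obtain the occurrence graph, might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the XcisClique implementation: the promoter sequences, gene names, and motif patterns are invented toy data, and only the forward strand is scanned.

```python
import re


def occurrence_graph(promoters, patterns):
    """Map each gene to the set of motif names whose regex matches its promoter."""
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    return {
        gene: {name for name, rx in compiled.items() if rx.search(seq)}
        for gene, seq in promoters.items()
    }


# Hypothetical promoters and motif regular expressions (toy data only).
promoters = {
    "geneA": "TTGACACGTGTT",
    "geneB": "AAACACGTGAAATATAAA",
    "geneC": "GGGGCCCC",
}
patterns = {"motif1": "CACGTG", "motif2": "TATA[AT]A"}

graph = occurrence_graph(promoters, patterns)
for gene in sorted(graph):
    print(gene, sorted(graph[gene]))
```

The resulting gene-to-motif mapping is the adjacency structure of the bipartite graph in which bicliques are then sought.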
Figure 3
Illustration of a Biclique. Biclique_4_4 is an example of a biclique. Genes G1, G2, G3, and G4 share regulatory motifs HSE, STRE, C/EBP, and UPRMOTIF on their promoters.
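
To make the notion of a maximal biclique concrete, the Python sketch below enumerates maximal bicliques on a toy graph built from the genes and motifs named in this caption, plus a hypothetical fifth gene G5 that carries only HSE and STRE. It uses a brute-force closure over gene subsets, which is a swapped-in technique chosen for brevity and is only practical for tiny inputs; it is not the Apriori-based procedure that XcisClique uses. The function name and the size thresholds are assumptions.

```python
from itertools import combinations


def maximal_bicliques(gene_motifs, min_genes=2, min_motifs=2):
    """Return a set of (genes, motifs) frozenset pairs that cannot be extended."""
    genes = sorted(gene_motifs)
    found = set()
    for size in range(min_genes, len(genes) + 1):
        for subset in combinations(genes, size):
            # Motifs shared by every gene in the subset.
            shared = frozenset(set.intersection(*(set(gene_motifs[g]) for g in subset)))
            if len(shared) < min_motifs:
                continue
            # Close the gene side: every gene that carries all shared motifs.
            closed = frozenset(g for g in genes if shared <= set(gene_motifs[g]))
            found.add((closed, shared))
    return found


four = {"HSE", "STRE", "C/EBP", "UPRMOTIF"}
gene_motifs = {"G1": four, "G2": four, "G3": four, "G4": four,
               "G5": {"HSE", "STRE"}}  # G5 is hypothetical

for genes, motifs in sorted(maximal_bicliques(gene_motifs), key=lambda b: len(b[0])):
    print(sorted(genes), sorted(motifs))
```

On this toy input the 4-gene, 4-motif biclique corresponds to Biclique_4_4, while adding G5 yields a second, wider but shallower biclique on HSE and STRE alone.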
Figure 4
Selected significant motif combinations for Case Study 1. Edges in a biclique are represented by boxes of a particular color. The presence of a biclique box in motif row x and gene column y indicates the presence of motif x in the promoter of gene y. For example, all red boxes represent edges in Biclique 111.
