BMC Bioinformatics. 2004 May 11:5:57.
doi: 10.1186/1471-2105-5-57.

Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

Davide Corà et al. BMC Bioinformatics. 2004.

Abstract

Background: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance.
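As a minimal sketch of what "significantly larger than expected by pure chance" can mean in practice, the snippet below counts occurrences of a motif in one upstream sequence and computes a binomial upper-tail probability. The binomial null model, the externally supplied background frequency, the choice to count overlapping matches, and all function names are assumptions made for illustration; the abstract does not specify the statistical test used.

```python
from math import comb


def count_occurrences(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, overlapping matches included."""
    width = len(motif)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    )


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def overrepresentation_pvalue(sequence: str, motif: str, background_freq: float):
    """Return (observed count, p-value) under an assumed binomial null model.

    background_freq is the assumed per-position probability of seeing the
    motif by chance, e.g. estimated from all upstream regions of the genome.
    """
    positions = max(0, len(sequence) - len(motif) + 1)
    observed = count_occurrences(sequence, motif)
    return observed, binomial_tail(positions, observed, background_freq)
```

A gene would then be assigned to the set for a given motif when this probability falls below a chosen threshold; the threshold itself is another free parameter not given in the abstract.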

Results: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set.
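The functional characterization step can be sketched as an enrichment test: for each Gene Ontology term, ask how likely it is that a random gene set of the same size would contain at least as many annotated genes. The hypergeometric test used here is a common choice for this kind of analysis but is an assumption, as are the function names and the dictionary-of-sets data layout; the abstract does not state which test the authors applied.

```python
from math import comb


def hypergeom_tail(population: int, annotated: int, sample: int, hits: int) -> float:
    """Return P(X >= hits) when drawing `sample` genes without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, j) * comb(population - annotated, sample - j)
        for j in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


def go_enrichment(gene_set: set, annotations: dict, genome: set) -> dict:
    """Map each GO term to its enrichment p-value for gene_set.

    annotations maps a GO term to the set of genes annotated to it
    (hypothetical layout); genes outside `genome` are ignored.
    """
    genes = gene_set & genome
    results = {}
    for term, term_genes in annotations.items():
        in_genome = term_genes & genome
        hits = len(in_genome & genes)
        results[term] = hypergeom_tail(len(genome), len(in_genome), len(genes), hits)
    return results
```

Because many motifs and many terms are tested, a real analysis would also need a multiple-testing correction before calling any set significantly characterized.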

Conclusions: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well studied organisms is likely to be valuable in the exploration of their regulatory interaction network.


Figures

Figure 1
Location of the motifs belonging to the consensus GATGAGATGAGCT with respect to the translation and transcription start sites for 6 genes for which the latter is known. The binding sites are denoted by rectangles above or below the line depending on whether the consensus sequence is read on the Crick or Watson strand respectively. The vertical bar is the transcription start site, as given in Ref. [20].

References

    1. van Helden J, André B, Collado-Vides J. Extracting Regulatory Sites from the Upstream Region of Yeast Genes by Computational Analysis of Oligonucleotide Frequencies. J Mol Biol. 1998;281:827–842. doi: 10.1006/jmbi.1998.1947.
    2. Caselle M, Di Cunto F, Provero P. Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes. BMC Bioinformatics. 2002;3:7. doi: 10.1186/1471-2105-3-7.
    3. The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genetics. 2000;25:25–29. doi: 10.1038/75556.
    4. Matys V, et al. TRANSFAC: transcriptional regulation from patterns to profiles. Nucleic Acids Research. 2003;31:374–378. doi: 10.1093/nar/gkg108.
    5. Lee TI, et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002;298:799–804. doi: 10.1126/science.1075090.
