Genome Res. 2004 Feb;14(2):201-8.
doi: 10.1101/gr.1448004.

A motif co-occurrence approach for genome-wide prediction of transcription-factor-binding sites in Escherichia coli

Martha L Bulyk et al. Genome Res. 2004 Feb.

Abstract

Various computational approaches have been developed for predicting cis-regulatory DNA elements in prokaryotic genomes. We describe a novel method for predicting transcription-factor-binding sites in Escherichia coli. Our method takes advantage of the principle that transcription factors frequently coregulate gene expression, but without requiring prior knowledge of which groups of genes are coregulated. Using position weight matrices for 49 known transcription factors, we examined spacings between pairs of matrix hits. These pairs were assigned probabilities according to the overrepresentation of their separation distance. The functions of many open reading frames (ORFs) downstream from predicted binding sites are unknown, and may correspond to novel regulon members. For five predictions, knockouts with mutated replacements of the predicted binding sites were created in E. coli MG1655. Quantitative real-time PCR (RT-PCR) indicates that for each of the knockouts, at least one gene immediately downstream exhibits a statistically significant change in mRNA expression. This approach may be useful in analyzing binding sites in a variety of organisms.
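
The spacing idea described in the abstract can be illustrated with a minimal sketch. It assumes the matrix hits for two factors are already available as lists of genomic positions, and it scores each separation distance by the ratio of its observed count to a uniform expectation. The function names, the `max_dist` window, and the uniform background are illustrative assumptions, not the statistic used in the paper.

```python
from collections import Counter
from itertools import product


def spacing_counts(hits_a, hits_b, max_dist=100):
    """Count separation distances between hits of motif A and motif B.

    hits_a, hits_b: hypothetical lists of hit positions (bp) for two matrices.
    Only pairs separated by 1..max_dist bp are counted.
    """
    counts = Counter()
    for a, b in product(hits_a, hits_b):
        d = abs(a - b)
        if 0 < d <= max_dist:
            counts[d] += 1
    return counts


def overrepresentation(counts, max_dist=100):
    """Observed/expected ratio per distance under a uniform background.

    The expected count per distance is total / max_dist; this crude
    background is an assumption made for illustration only.
    """
    total = sum(counts.values())
    return {d: c * max_dist / total for d, c in counts.items()}


# Toy positions (invented for the example): three pairs sit 20 bp apart.
hits_a = [100, 500, 900, 1300]
hits_b = [120, 520, 920, 1350]
counts = spacing_counts(hits_a, hits_b)
ratios = overrepresentation(counts)
```

In this toy input the 20 bp spacing occurs three times and therefore receives the highest ratio; a real analysis would replace the uniform expectation with a background model appropriate to the genome.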

Figures

Figure 1
Summary of binding site knockouts. (Solid X) Predicted binding sites that were knocked out. (Dashed X) Knockouts of four predicted LexA-binding sites in the gor–arsR IGR that were not successful, possibly because of lethality. Distances between genes are approximately to scale. In the uppermost construct, two predicted ArgR-binding sites (cross-hatched boxes) were knocked out in the aroP–pdhR IGR. In the construct shown below that one, only one of the two predicted GalR-binding sites (cross-hatched boxes) in the ppa–ytfQ IGR was knocked out; the predicted CRP site (stippled boxes) was not knocked out. The other successfully created constructs were a knockout of three predicted PhoB sites (cross-hatched boxes) in the dinJ–yafL IGR; a knockout of three predicted PhoB sites (cross-hatched boxes) in the yqeF–yqeG IGR; and a knockout of four predicted MetJ sites (cross-hatched boxes) in the ybdH–ybdL IGR. The predicted ArgR-binding sites are 241 bp upstream of aroP and 260 bp upstream of pdhR; the predicted PhoB sites are 103 bp upstream of dinJ and 73 bp upstream of yafL; the predicted GalR and CRP sites are 164 bp upstream of ppa and 129 bp upstream of ytfQ; the predicted MetJ sites are 18 bp upstream of ybdH and 58 bp upstream of ybdL; the predicted PhoB sites are 37 bp upstream of yqeF and 181 bp upstream of yqeG; the predicted LexA sites are 328 bp downstream of gor and 556 bp upstream of arsR (by “upstream,” here we mean the distance between the predicted binding sites and the start codon of the gene).
Figure 2
Design of binding site knockouts. Only the binding site substitution knockouts for ArgR, GalR, and PhoB are shown. The same strategy was followed in designing the MetJ- and LexA-binding site mutations (data not shown).
Figure 3
Primer extension assay of ArgR-binding site knockout in the aroP–pdhR intergenic region. The levels of pdhR transcript were measured, using the 23S-specific internal probe as an internal quantitation control for each RNA sample.

WEB SITE REFERENCES

    1. http://arep.med.harvard.edu/ecoli_matrices/spacing/spacing_predictions.html; Web site contains tab-delimited files containing predictions based on individual spacings, and separately based on spacing bins.
    2. http://arep.med.harvard.edu/labgc/pko3.html; Descriptions of the gene replacement vectors pKO3 and pKOV.
    3. http://twod.med.harvard.edu/labgc/estep/longPCR_protocol.html; Descriptions of the PCR conditions and protocols used in this project.
