Nucleic Acids Res. 2006;34(20):5943-50.
doi: 10.1093/nar/gkl608. Epub 2006 Oct 26.

Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction

Uwe Ohler. Nucleic Acids Res. 2006.

Abstract

The reliable recognition of eukaryotic RNA polymerase II core promoters, and the associated transcription start sites (TSSs) of genes, has been an ongoing challenge for computational biology. High-throughput experimental methods such as tiling arrays or 5' SAGE/EST sequencing have recently led to much larger datasets of core promoters, and to the assessment that well-known core promoter sequence elements such as the TATA box appear to be much less frequent than previously thought. Here, we address the co-occurrence of several previously identified core promoter sequence motifs in Drosophila melanogaster to determine frequently occurring core promoter modules. We then use this in a new strategy to model core promoters as a set of alternative submodels for different core promoter architectures reflecting these different motif modules. We show that this system greatly improves computational promoter recognition and leads to highly accurate in silico TSS prediction. Our results indicate that, at least for the fruit fly, we are getting closer to an understanding of how the beginning of a gene is defined in a eukaryotic genome.
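
The idea of scoring a sequence against a set of alternative submodels can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the McPromoter implementation: the abstract does not describe the submodels, so the two nucleotide distributions, the uniform background, and the names `SUBMODELS`, `log_odds`, and `best_submodel` are all hypothetical. Each candidate sequence is scored by every submodel, and the best-scoring architecture is reported.

```python
import math

# Assumed uniform background distribution (not taken from the paper).
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical submodels, one per promoter architecture.
# Real submodels would be position-specific and trained on data.
SUBMODELS = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def log_odds(seq, model, background=BACKGROUND):
    """Sum of per-base log-likelihood ratios of model versus background."""
    return sum(math.log(model[base] / background[base]) for base in seq)


def best_submodel(seq, submodels=SUBMODELS):
    """Score seq under every submodel and return (name, score) of the best one."""
    scores = {name: log_odds(seq, model) for name, model in submodels.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Calling `best_submodel("TATAAA")` selects the hypothetical `at_rich` submodel. Taking the maximum over submodels is one simple way to combine alternatives; a weighted mixture would be another.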

Figures

Figure 1
A new Drosophila core promoter module. (A and B) show the location distributions of motifs 6 and 1 relative to the TSS (pos. 0), and (C) shows the distance between motif occurrences in the same promoter.
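
The two quantities plotted in Figure 1, motif location relative to the TSS and the distance between two motifs in the same promoter, can be computed along the following lines. This is a minimal sketch that assumes exact string matching; the page does not give the sequences of motifs 6 and 1, so the caller supplies the motif strings, and the function names `motif_positions` and `pairwise_distances` are hypothetical.

```python
def motif_positions(seq, motif, tss_index):
    """Return start positions of exact motif matches, relative to the TSS (pos. 0).

    tss_index is the 0-based index of the TSS within seq (an assumed convention).
    """
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start - tss_index)
        # Continue one base further so overlapping matches are also found.
        start = seq.find(motif, start + 1)
    return positions


def pairwise_distances(positions_a, positions_b):
    """Return the signed distance (b - a) for every pair of occurrences."""
    return [b - a for a in positions_a for b in positions_b]
```

Collecting these values over all promoters in a dataset and building histograms from them would give location and distance distributions of the kind shown in panels A to C. A real analysis would more likely score matches against weight matrices than use exact strings.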
Figure 2
Comparison of motif module frequency in the initial and the final partitioning after semi-supervised clustering. Only initial frequencies are shown for two partitions: the partition of sequences without a strong motif hit (‘no initial’), i.e. those not initially assigned to a particular motif class, and the MTE motif partition, which proved unstable and was gradually split up among the other classes. For each of the final partitions, we show the number of promoters with the same motif/module, i.e. those remaining from the initial partitions (blue); the number of promoters which were initially assigned to a different partition among the five stable subclasses (red); and the number of promoters from the initial ‘no motif’ and MTE partitions (yellow). Promoters were assigned to several initial partitions when several motifs/modules had a good hit, so the combined size of the initial partitions adds up to more than the total dataset of 1864 promoters.
Figure 3
Specific sequence profiles (left) in five different subclasses of core promoters (right). The left shows the average GC trinucleotide content in the region [−250, +50]. The right depicts the different core modules currently modeled in the McPromoter system.
