The cross-species prediction of bacterial promoters using a support vector machine
- PMID: 18703385
- DOI: 10.1016/j.compbiolchem.2008.07.009
Abstract
Because the observed binding sites are degenerate, the in silico prediction of bacterial sigma(70)-like promoters remains a challenging problem. Large numbers of sigma(70)-like promoters have been identified experimentally in only two species, Escherichia coli and Bacillus subtilis. In this paper, we investigate the issues that arise when searching for promoters in other species using an ensemble of SVM classifiers trained on E. coli promoters. DNA sequences are represented using a tagged mismatch string kernel. The major benefit of our approach is that it does not require a prior definition of the typical -35 and -10 hexamers, which gives the SVM classifiers the freedom to discover other features relevant to the prediction of promoters. We use our approach to predict sigma(A) promoters in B. subtilis and sigma(66) promoters in Chlamydia trachomatis. We extended the analysis to identify specific regulatory features of gene sets in C. trachomatis that have different expression profiles, and found a strong -35 hexamer and TGN/-10 associated with a set of early expressed genes. Our analysis highlights the advantage of using TSS-PREDICT as a starting point for predicting promoters in species where few are known.
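To make the sequence representation concrete, the following is a minimal sketch of a plain (k, m)-mismatch string kernel in Python. It is not the authors' implementation: the abstract does not describe how k-mers are tagged, so the tagging step is omitted, and the function names and the defaults `k=5`, `m=1` are assumptions chosen for illustration. Sequences are assumed to be uppercase strings over A, C, G and T.

```python
from itertools import combinations, product

ALPHABET = "ACGT"  # assumption: uppercase DNA with no ambiguity codes


def mismatch_neighbors(kmer, m):
    """Return every k-mer within Hamming distance <= m of `kmer`."""
    neighbors = {kmer}
    for d in range(1, m + 1):
        for positions in combinations(range(len(kmer)), d):
            for letters in product(ALPHABET, repeat=d):
                chars = list(kmer)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                # The set removes duplicates that arise when a substituted
                # letter equals the original one.
                neighbors.add("".join(chars))
    return neighbors


def mismatch_features(seq, k, m):
    """Sparse feature map: for each k-mer, count the windows of `seq`
    that lie within m mismatches of it."""
    features = {}
    for i in range(len(seq) - k + 1):
        for neighbor in mismatch_neighbors(seq[i:i + k], m):
            features[neighbor] = features.get(neighbor, 0) + 1
    return features


def mismatch_kernel(seq_a, seq_b, k=5, m=1):
    """Inner product of the two mismatch feature maps (hypothetical defaults)."""
    fa = mismatch_features(seq_a, k, m)
    fb = mismatch_features(seq_b, k, m)
    if len(fa) > len(fb):
        fa, fb = fb, fa  # iterate over the smaller map
    return sum(count * fb.get(kmer, 0) for kmer, count in fa.items())
```

A Gram matrix built from `mismatch_kernel` could then be passed to an SVM implementation that accepts precomputed kernels. Because the feature map counts approximate k-mer matches anywhere in the sequence, no -35 or -10 consensus has to be specified in advance, which is the property the abstract emphasizes.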