Improved predictions of transcription factor binding sites using physicochemical features of DNA
- PMID: 22923524
- PMCID: PMC3526315
- DOI: 10.1093/nar/gks771
Abstract
Typical approaches for predicting transcription factor binding sites (TFBSs) involve use of a position-specific weight matrix (PWM) to statistically characterize the sequences of the known sites. Recently, an alternative physicochemical approach, called SiteSleuth, was proposed. In this approach, a linear support vector machine (SVM) classifier is trained to distinguish TFBSs from background sequences based on local chemical and structural features of DNA. SiteSleuth appears to generally perform better than PWM-based methods. Here, we improve the SiteSleuth approach by considering both new physicochemical features and algorithmic modifications. New features are derived from Gibbs energies of amino acid-DNA interactions and hydroxyl radical cleavage profiles of DNA. Algorithmic modifications consist of inclusion of a feature selection step, use of a nonlinear kernel in the SVM classifier, and use of a consensus-based post-processing step for predictions. We also considered SVM classification based on letter features alone to distinguish performance gains from use of SVM-based models versus use of physicochemical features. The accuracy of each of the variant methods considered was assessed by cross validation using data available in the RegulonDB database for 54 Escherichia coli TFs, as well as by experimental validation using published ChIP-chip data available for Fis and Lrp.
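The two sequence-only representations mentioned in the abstract, a PWM built from known sites and "letter features" for an SVM, can be illustrated with a minimal sketch. This is not the SiteSleuth implementation: the function names, the uniform background, the pseudocount of 1, and the toy sites are all assumptions made for illustration, and the physicochemical features, feature selection, nonlinear kernel, and consensus post-processing steps are not shown.

```python
import math

BASES = "ACGT"

def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log2-odds position weight matrix from equal-length known sites.

    Assumptions (not from the paper): uniform background frequencies and an
    additive pseudocount so that unseen bases keep a finite score.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background[b])
                    for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Sum the per-position log-odds scores of a candidate sequence."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, seq))

def letter_features(seq):
    """Encode a sequence as 4 binary indicators per position (A, C, G, T).

    A vector like this could serve as input to an SVM classifier.
    """
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

# Hypothetical toy sites, invented for this example only.
toy_sites = ["ACGT", "ACGT", "ACGA"]
toy_pwm = build_pwm(toy_sites)
site_like = pwm_score(toy_pwm, "ACGT")
background_like = pwm_score(toy_pwm, "TTTT")
```

In this toy setup, a sequence resembling the known sites scores higher than an unrelated one. A classifier trained on `letter_features` vectors would use the same sequence information but learn its own decision boundary, which is the comparison the abstract uses to separate gains from the SVM model and gains from physicochemical features.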