MotEvo: integrated Bayesian probabilistic methods for inferring regulatory sites and motifs on multiple alignments of DNA sequences
- PMID: 22334039
- DOI: 10.1093/bioinformatics/btr695
Abstract
Motivation: Probabilistic approaches for inferring transcription factor binding sites (TFBSs) and regulatory motifs from DNA sequences have been developed for over two decades. Previous work has shown that prediction accuracy can be significantly improved by incorporating features such as the competition of multiple transcription factors (TFs) for binding to nearby sites, the tendency of TFBSs for co-regulated TFs to cluster and form cis-regulatory modules, and explicit evolutionary modeling of conservation of TFBSs across orthologous sequences. However, currently available tools incorporate only some of these features, and significant methodological hurdles have hampered their synthesis into a single consistent probabilistic framework.
Results: We present MotEvo, an integrated suite of Bayesian probabilistic methods for the prediction of TFBSs and inference of regulatory motifs from multiple alignments of phylogenetically related DNA sequences, which incorporates all of the features just mentioned. In addition, MotEvo incorporates a novel model for detecting unknown functional elements that are under evolutionary constraint, and a new robust model for treating gain and loss of TFBSs along a phylogeny. Rigorous benchmarking tests on ChIP-seq datasets show that MotEvo's novel features significantly improve the accuracy of TFBS prediction, motif inference and enhancer prediction.
Availability: Source code, a user manual and files with several example applications are available at www.swissregulon.unibas.ch.