PhyloGibbs-MP: module prediction and discriminative motif-finding by Gibbs sampling
- PMID: 18769735
- PMCID: PMC2518514
- DOI: 10.1371/journal.pcbi.1000156
Abstract
PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules, two tasks that are usually done by separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to that of dedicated module-prediction software that uses prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other "discriminative motif-finders" have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites, and that it is comparable with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use "informative priors" on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data.
Conflict of interest statement
The author has declared that no competing interests exist.