The value of position-specific priors in motif discovery using MEME
- PMID: 20380693
- PMCID: PMC2868008
- DOI: 10.1186/1471-2105-11-179
Abstract
Background: Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Information of many types (including sequence conservation, nucleosome positioning, and negative examples) can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM).
Results: We extend the popular EM-based MEME algorithm to utilize position-specific priors and demonstrate their effectiveness for discovering transcription factor (TF) motifs in yeast and mouse DNA sequences. Utilizing a discriminative, conservation-based prior dramatically improves MEME's ability to discover motifs in 156 yeast TF ChIP-chip datasets, more than doubling the number of datasets where it finds the correct motif. On these datasets, MEME using the prior has a higher success rate than eight other conservation-based motif discovery approaches. We also show that the same type of prior improves the accuracy of motifs discovered by MEME in mouse TF ChIP-seq data, and that the motifs tend to be of slightly higher quality than those found by a Gibbs sampling algorithm using the same prior.
Conclusions: We conclude that using position-specific priors can substantially increase the power of EM-based motif discovery algorithms such as MEME.
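To make the idea of a prior "guiding" an EM-based search concrete, the sketch below shows one way a position-specific prior could enter the E-step. It is an illustration only, not MEME's implementation: it assumes a simplified model with exactly one site per sequence and a uniform background, and the function name `site_posteriors`, the toy motif, and the prior values are all hypothetical.

```python
import math

def site_posteriors(seq, pwm, prior, background=0.25):
    """E-step sketch: posterior over motif start positions in one sequence.

    Assumes (for illustration) exactly one site per sequence and a uniform
    background. `pwm` is a list of dicts mapping base -> probability, one per
    motif column; `prior` holds one weight per candidate start position.
    """
    width = len(pwm)
    n_starts = len(seq) - width + 1
    if len(prior) != n_starts:
        raise ValueError("prior needs one entry per candidate start position")
    weights = []
    for start in range(n_starts):
        # Log-likelihood ratio of the motif model vs. background for this window.
        llr = sum(math.log(pwm[j][seq[start + j]] / background)
                  for j in range(width))
        # The prior rescales the likelihood term before normalization.
        weights.append(prior[start] * math.exp(llr))
    total = sum(weights)
    return [w / total for w in weights]

def column(best):
    """Toy motif column: 0.85 for the preferred base, 0.05 for the others."""
    return {base: (0.85 if base == best else 0.05) for base in "ACGT"}

pwm = [column("G"), column("A"), column("C")]   # hypothetical motif "GAC"
seq = "GACTTTGAC"                               # two identical candidate sites
n_starts = len(seq) - len(pwm) + 1

uniform_prior = [1.0 / n_starts] * n_starts
informed_prior = [0.05] * (n_starts - 1) + [0.7]  # favors the last position

post_uniform = site_posteriors(seq, pwm, uniform_prior)
post_informed = site_posteriors(seq, pwm, informed_prior)
```

With the uniform prior, the two identical `GAC` windows receive equal posterior weight. With the informed prior, the window at the favored position dominates, which is the sense in which external evidence such as conservation can steer the search toward one of several equally good sequence matches.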
Figures
(caption truncated) ... prior. The inter-motif distance (scaled Euclidean distance) is computed as described in Additional file 1.
