Memetic algorithms for de novo motif-finding in biomedical sequences
- PMID: 22613029
- DOI: 10.1016/j.artmed.2012.04.002
Abstract
Objectives: To design and implement a new memetic algorithm for de novo motif discovery, and to apply it to detect important signals hidden in various biomedical molecular sequences.
Methods and materials: Memetic algorithms are developed and tested on de novo motif-finding problems. The algorithm design employs several strategies intended not only to explore the multiple sequence local alignment space efficiently, but also to uncover the molecular signals effectively. The key features of the resulting memetic motif-finding algorithm (MaMotif) include a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with several similar algorithms using simulated and experimental data, including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences.
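The abstract names MaMotif's operators without defining them, so the following Python sketch is only an illustration of the general memetic pattern (a genetic search combined with local refinement), not the authors' C++ implementation. Everything in it is an assumption: a chromosome is taken to be one motif start position per sequence, fitness is the information content of the resulting alignment against a uniform background, "replacement" means a better child displaces the worst chromosome, "alteration-aware" means local search runs only on children that differ from both parents, "truncated" means a capped number of refinement sweeps, and "stochastic" means local search is applied with probability `p_local`. All function and parameter names are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"  # assumption: uppercase DNA only


def profile(seqs, starts, width, skip=None, pseudo=0.5):
    """Column-wise letter frequencies of the aligned sites, with pseudocounts."""
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for i, (s, p) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][s[p + j]] += 1
    return [{a: c / sum(col.values()) for a, c in col.items()} for col in counts]


def fitness(seqs, starts, width):
    """Information content (bits) of the alignment against a uniform background."""
    bg = 1.0 / len(ALPHABET)
    prof = profile(seqs, starts, width)
    return sum(p * math.log2(p / bg) for col in prof for p in col.values())


def local_search(seqs, starts, width, max_sweeps=2):
    """Truncated greedy refinement: re-place each site against the other sites' profile."""
    starts = list(starts)
    for _ in range(max_sweeps):  # the cap on sweeps is the "truncation"
        changed = False
        for i, s in enumerate(seqs):
            prof = profile(seqs, starts, width, skip=i)
            best = max(
                range(len(s) - width + 1),
                key=lambda p: sum(math.log(prof[j][s[p + j]]) for j in range(width)),
            )
            if best != starts[i]:
                starts[i], changed = best, True
        if not changed:
            break
    return starts


def memetic_motif_search(seqs, width, pop_size=20, generations=30,
                         p_local=0.5, p_mut=0.2, seed=0):
    """Steady-state memetic search; needs at least two sequences and pop_size >= 4."""
    rng = random.Random(seed)

    def score(chrom):
        return fitness(seqs, chrom, width)

    pop = [[rng.randrange(len(s) - width + 1) for s in seqs] for _ in range(pop_size)]
    pop.sort(key=score, reverse=True)
    for _ in range(generations):
        a, b = rng.sample(pop[: pop_size // 2], 2)  # parents from the better half
        cut = rng.randrange(1, len(seqs))           # one-point crossover
        child = a[:cut] + b[cut:]
        altered = child != a and child != b
        if rng.random() < p_mut:                    # move one site at random
            i = rng.randrange(len(seqs))
            child[i] = rng.randrange(len(seqs[i]) - width + 1)
            altered = True
        # Local search only on altered children, and only with probability p_local.
        if altered and rng.random() < p_local:
            refined = local_search(seqs, child, width)
            if score(refined) > score(child):
                child = refined
        # Replacement: the child displaces the worst chromosome if it is better.
        if score(child) > score(pop[-1]):
            pop[-1] = child
            pop.sort(key=score, reverse=True)
    return pop[0], score(pop[0])


if __name__ == "__main__":
    demo = ["ACGTTGACCTGA", "TTGACCGGATAC", "GGATTGACCATT", "CATTGACCAGGT"]
    print(memetic_motif_search(demo, width=6))
```

Because only the worst chromosome is ever replaced, the best score in the population never decreases between generations. Recomputing the fitness on every comparison keeps the sketch short; a practical implementation would cache scores.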
Results: The new memetic motif-finding algorithm is implemented in C++ and exhaustively tested with various simulated and real biological sequences. In the simulation, MaMotif is the most time-efficient of the algorithms compared: it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, the new algorithm compares favorably with, or is superior to, the other algorithms. Notably, MaMotif discovers the transcription factor binding sites in chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncovers the RNA splicing signals in gene expression, precisely finds the highly conserved helix motif in the transmembrane protein sequences, and correctly detects the palindromic segments in the primary microRNA sequences.
Conclusions: The memetic motif-finding algorithm is effectively designed and implemented, and its applications demonstrate that it is not only time-efficient but also performs excellently when compared with other popular algorithms.
Copyright © 2012 Elsevier B.V. All rights reserved.