Classification and assessment tools for structural motif discovery algorithms
- PMID: 23902564
- PMCID: PMC3698030
- DOI: 10.1186/1471-2105-14-S9-S4
Abstract
Background: Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, as is mainly the case for motifs discovered in DNA sequences, or structural, as for RNA motifs. Finding common structural patterns helps in understanding mechanisms of action such as post-transcriptional regulation. Unlike DNA motifs, which are conserved in sequence, RNA motifs are conserved in structure, and a structure may be shared even when the underlying sequences differ. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done on the structural case.
Methods: In this paper, we survey, classify, and compare algorithms that solve the structural motif discovery problem, where the underlying sequences may differ, and we highlight their strengths and weaknesses. We first propose a benchmark dataset and a measurement tool for evaluating motif discovery approaches, then describe our experimental setup, and finally use the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools designed solely for structural motif discovery.
Results: The accuracy of the discovered motifs is relatively low. The results also suggest that the tools are complementary: some perform well on simple structures, while others are better on complex structures.
Conclusions: We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.
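The abstract does not say how the measurement tool scores a discovered motif. As an illustration only, the sketch below shows one common way to compare a predicted RNA secondary structure against a reference: count shared base pairs in dot-bracket notation and report sensitivity, positive predictive value (PPV), and their harmonic mean. The function names `parse_pairs` and `pair_accuracy`, the choice of metrics, and the example structures are assumptions made here, not taken from the paper.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base-pair positions in a dot-bracket string.

    Assumption: only '(', ')' and '.' are used (no pseudoknot brackets).
    """
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def pair_accuracy(reference, predicted):
    """Base-pair sensitivity, PPV and F1 of a predicted structure.

    Hypothetical scoring scheme for illustration; both structures must
    describe the same sequence and so have the same length.
    """
    if len(reference) != len(predicted):
        raise ValueError("structures must have the same length")
    ref, pred = parse_pairs(reference), parse_pairs(predicted)
    shared = len(ref & pred)  # base pairs present in both structures
    sensitivity = shared / len(ref) if ref else 0.0
    ppv = shared / len(pred) if pred else 0.0
    total = sensitivity + ppv
    f1 = 2 * sensitivity * ppv / total if total else 0.0
    return sensitivity, ppv, f1


# Made-up hairpin: the prediction misses the outermost base pair.
sens, ppv, f1 = pair_accuracy("((((...))))", ".(((...))).")
print(sens, ppv, round(f1, 3))
```

Scoring base pairs rather than characters means a prediction is rewarded only for pairing the right positions with each other, which is the property that matters when a motif is conserved in structure rather than in sequence.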