Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine
- PMID: 16381612
- PMCID: PMC1360673
- DOI: 10.1186/1471-2105-6-310
Abstract
Background: MicroRNAs (miRNAs) are a group of short (approximately 22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology.
Results: A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs from those of pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without using any comparative genomics information.
Conclusion: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.
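The abstract does not spell out how the local structure-sequence features are encoded, so the following is only a minimal sketch under stated assumptions: each nucleotide is combined with the paired/unpaired status of itself and its two neighbours in a dot-bracket secondary structure, giving 4 x 8 = 32 normalized frequencies. The names `triplet_features` and `FEATURE_NAMES` are hypothetical, and the structure string is assumed to come from an external folding tool. Vectors of this kind could then be passed to any SVM implementation, which is not shown here.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Assumption: paired positions are collapsed to "(" regardless of bracket
# direction, so each position is either paired "(" or unpaired ".".
STRUCT_TRIPLETS = ["".join(p) for p in product("(.", repeat=3)]
# 4 nucleotides x 8 structure triplets = 32 feature names, e.g. "A((.".
FEATURE_NAMES = [n + s for n in NUCLEOTIDES for s in STRUCT_TRIPLETS]


def triplet_features(sequence, structure):
    """Return 32 normalized frequencies of (middle nucleotide, local structure).

    `structure` is a dot-bracket string of the same length as `sequence`.
    This simplified sketch counts every interior position of the hairpin.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper().replace("T", "U")
    collapsed = structure.replace(")", "(")
    counts = dict.fromkeys(FEATURE_NAMES, 0)
    for i in range(1, len(seq) - 1):
        key = seq[i] + collapsed[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as "N"
            counts[key] += 1
    total = sum(counts.values())
    return [counts[name] / total if total else 0.0 for name in FEATURE_NAMES]


if __name__ == "__main__":
    # Toy hairpin, for illustration only (not taken from the paper's data).
    vector = triplet_features("GGGAAACCC", "(((...)))")
    for name, value in zip(FEATURE_NAMES, vector):
        if value > 0:
            print(name, round(value, 3))
```

Normalizing by the total count keeps hairpins of different lengths comparable, which matters if a classifier trained on one species is applied to another.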
Similar articles
- De novo SVM classification of precursor microRNAs from genomic pseudo hairpins using global and intrinsic folding measures. Bioinformatics. 2007 Jun 1;23(11):1321-30. doi: 10.1093/bioinformatics/btm026. Epub 2007 Jan 31. PMID: 17267435
- New syntax to describe local continuous structure-sequence information for recognizing new pre-miRNAs. J Theor Biol. 2010 May 21;264(2):578-84. doi: 10.1016/j.jtbi.2010.02.037. Epub 2010 Mar 2. PMID: 20202471
- Identification of homologous microRNAs in 56 animal genomes. Genomics. 2010 Jul;96(1):1-9. doi: 10.1016/j.ygeno.2010.03.009. Epub 2010 Mar 27. PMID: 20347954
- Role of miRNA in carcinogenesis and biomarker selection: a methodological view. Expert Rev Mol Diagn. 2007 Sep;7(5):569-603. doi: 10.1586/14737159.7.5.569. PMID: 17892365 Review.
- Computational methods for microRNA target prediction. Methods Enzymol. 2007;427:65-86. doi: 10.1016/S0076-6879(07)27004-1. PMID: 17720479 Review.
Cited by
- A Review of Computational Tools in microRNA Discovery. Front Genet. 2013 May 15;4:81. doi: 10.3389/fgene.2013.00081. eCollection 2013. PMID: 23720668 Free PMC article.
- miReader: Discovering Novel miRNAs in Species without Sequenced Genome. PLoS One. 2013 Jun 21;8(6):e66857. doi: 10.1371/journal.pone.0066857. Print 2013. PMID: 23805282 Free PMC article.
- Effective classification of microRNA precursors using feature mining and AdaBoost algorithms. OMICS. 2013 Sep;17(9):486-93. doi: 10.1089/omi.2013.0011. Epub 2013 Jun 29. PMID: 23808606 Free PMC article.
- Integrated sequence-structure motifs suffice to identify microRNA precursors. PLoS One. 2012;7(3):e32797. doi: 10.1371/journal.pone.0032797. Epub 2012 Mar 15. PMID: 22438883 Free PMC article.
- repRNA: a web server for generating various feature vectors of RNA sequences. Mol Genet Genomics. 2016 Feb;291(1):473-81. doi: 10.1007/s00438-015-1078-7. Epub 2015 Jun 18. PMID: 26085220