BMC Bioinformatics. 2014 May 2:15:124.
doi: 10.1186/1471-2105-15-124.

The discriminant power of RNA features for pre-miRNA recognition

Ivani de O N Lopes et al. BMC Bioinformatics. 2014.

Abstract

Background: Computational discovery of microRNAs (miRNA) is based on pre-determined sets of features from miRNA precursors (pre-miRNA). Some feature sets are composed of sequence-structure patterns commonly found in pre-miRNAs, while others are a combination of more sophisticated RNA features. In this work, we analyze the discriminant power of seven feature sets, which are used in six pre-miRNA prediction tools. The analysis is based on the classification performance achieved with these feature sets for the training algorithms used in these tools. We also evaluate feature discrimination through the F-score and feature importance in the induction of random forests.
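As a minimal sketch of how a single feature could be scored for discrimination, the snippet below computes an F-score from the feature's values in the positive (pre-miRNA) and negative classes. It assumes the F-score meant here is the ratio of between-class to within-class variation commonly used for feature ranking with SVMs; the abstract does not give the formula, so the function name `f_score` and the exact definition are assumptions.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-score of one feature (assumed definition, not taken from the abstract).

    pos, neg: values of the feature in the positive and negative class.
    Larger scores suggest that the feature separates the classes better.
    """
    overall = mean(list(pos) + list(neg))
    # Squared distance of each class mean from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Sum of the sample variances (n - 1 denominator) of the two classes.
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


# Toy values, for illustration only.
print(f_score([1, 2, 3], [4, 5, 6]))  # 2.25
print(f_score([1, 2, 3], [1, 2, 3]))  # 0.0
```

The score treats each feature in isolation, so it cannot reveal information that only emerges from combinations of features.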

Results: Classifiers induced from feature sets with diverse features showed small or non-significant differences in estimated classification performance, despite wide differences in the dimension of these sets. Inspired by these results, we obtained a lower-dimensional feature set, which achieved a sensitivity of 90% and a specificity of 95%. These estimates are within 0.1% of the maximal values obtained with any feature set (SELECT, Section "Results and discussion"), while the proposed set is 34 times faster to compute. Even compared to another feature set (FS2, see Section "Results and discussion"), which is the computationally least expensive feature set of those from the literature that perform within 0.1% of the maximal values, the proposed set is 34 times faster to compute. The results obtained by the reference tools used in the experiments showed that five of these six tools have lower sensitivity or specificity.
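For readers less familiar with the two performance measures reported above, here is a small sketch using their standard definitions: sensitivity is the fraction of true pre-miRNAs that are recognized, and specificity is the fraction of negative sequences that are correctly rejected. The counts in the example are hypothetical and chosen only to illustrate the arithmetic; they are not data from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of positive examples (true pre-miRNAs) predicted as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negative examples predicted as negative."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fn = 90, 10   # positives: correctly / incorrectly classified
tn, fp = 95, 5    # negatives: correctly / incorrectly classified

print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```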

Conclusion: In miRNA discovery, the number of putative miRNA loci is on the order of millions. Analysis of putative pre-miRNAs using a computationally expensive feature set would be wasteful or even unfeasible for large genomes. In this work, we propose a relatively inexpensive feature set and explore most of the learning aspects implemented in current ab-initio pre-miRNA prediction tools, which may lead to the development of efficient ab-initio pre-miRNA discovery tools. The material to reproduce the main results from this paper can be downloaded from http://bioinformatics.rutgers.edu/Static/Software/discriminant.tar.gz.


Figures

Figure 1
Average feature importance estimated during the induction of RF ensembles. Features with importance lower than five were omitted. The average feature importance drops off quickly after the 10th feature, indicating that for each ensemble there are few distinguishing features.
Figure 2
Predictive performance of classifiers across the 12.5%-quantile bins of the G+C content distribution. The prediction of the secondary structure of G+C-rich sequences is more challenging. This figure shows that the classification of G+C-rich pre-miRNA sequences is also more complex. As the G+C content increased, the sensitivity dropped, except when SVM was trained with feature sets including %G+C-based features (FS1, FS2 and FS7).
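The grouping by G+C content described in this caption could be sketched as follows: compute the G+C fraction of each sequence, then assign each sequence to one of eight bins delimited by the 12.5% quantiles of that fraction. This is an illustrative sketch only; the helper names `gc_content` and `octile_bins` are hypothetical, and the quantile estimator the authors used is not stated here, so the default of Python's `statistics.quantiles` is an assumption.

```python
from bisect import bisect_right
from statistics import quantiles


def gc_content(seq):
    """Fraction of G and C nucleotides in an RNA or DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def octile_bins(values):
    """Assign each value to one of 8 bins (0..7) cut at the 12.5% quantiles."""
    cuts = quantiles(values, n=8)  # 7 cut points between the 8 bins
    return [bisect_right(cuts, v) for v in values]


# Toy sequences, for illustration only.
seqs = ["GGCCAAUU", "AUAUAUAU", "GCGCGCGC", "GCAUAUAU"]
fractions = [gc_content(s) for s in seqs]
print(fractions)
print(octile_bins(fractions))
```

Sensitivity could then be computed separately within each bin to look for the trend the caption describes.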
