BMC Bioinformatics. 2005 Dec 29:6:310.
doi: 10.1186/1471-2105-6-310.

Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine

Chenghai Xue et al. BMC Bioinformatics.

Abstract

Background: MicroRNAs (miRNAs) are a group of short (approximately 22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology.

Results: A set of novel features capturing local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs from those of pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without using any comparative genomics information.

Conclusion: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.


Figures

Figure 1
Using triplet elements to represent the local structure-sequence features of the hairpin. Each triplet element is composed of 3 contiguous sub-structures and the nucleotide type at the middle position. The occurrences of all 32 possible triplet elements are counted along a hairpin segment, forming a 32-dimensional vector, which is then normalized to become the input vector for the SVM.
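The counting scheme in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code, and it rests on several assumptions that the caption does not state: the secondary structure is given in dot-bracket notation, both bracket types are collapsed into a single "paired" state (which yields 4 nucleotides × 2^3 structure patterns = 32 elements), and normalization means dividing each count by the total number of counted triplets. The names `triplet_features` and `TRIPLET_ELEMENTS` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
STATES = "(."  # assumption: "(" = paired, "." = unpaired

# Fixed ordering of the 32 triplet elements: middle nucleotide followed by
# the paired/unpaired states of three adjacent positions, e.g. "A(((".
TRIPLET_ELEMENTS = [
    nt + "".join(states)
    for nt in NUCLEOTIDES
    for states in product(STATES, repeat=3)
]


def triplet_features(sequence, structure):
    """Return a normalized 32-dimensional triplet-element vector.

    `structure` is a dot-bracket string of the same length as `sequence`.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")

    seq = sequence.upper().replace("T", "U")
    # Assumption: "(" and ")" are both treated as "paired".
    states = structure.replace(")", "(")

    counts = dict.fromkeys(TRIPLET_ELEMENTS, 0)
    # Slide over every position that has a neighbor on both sides.
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as N
            counts[key] += 1

    # Assumption: normalize by the total count so the entries sum to 1.
    total = sum(counts.values())
    return [counts[e] / total if total else 0.0 for e in TRIPLET_ELEMENTS]


if __name__ == "__main__":
    # Toy hairpin, made up for illustration only.
    vector = triplet_features("GGGAAACCC", "(((...)))")
    for element, value in zip(TRIPLET_ELEMENTS, vector):
        if value:
            print(element, round(value, 3))
```

The resulting list can be passed directly to any SVM implementation as one input row. Other normalization schemes (for example, dividing by segment length) would fit the caption equally well.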
Figure 2
The average appearance frequencies of the triplet elements in the two classes (real pre-miRNA vs. pseudo-miRNA hairpins).
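The per-class averages described in the Figure 2 caption amount to an element-wise mean over the feature vectors of each class. The sketch below shows that arithmetic under stated assumptions: the helper name `class_mean_frequencies` is hypothetical, and the 3-dimensional vectors are toy values made up for illustration (the real vectors have 32 entries and are not reproduced here).

```python
def class_mean_frequencies(vectors):
    """Average a list of equal-length feature vectors element-wise."""
    if not vectors:
        raise ValueError("need at least one feature vector")
    n = len(vectors)
    # zip(*vectors) groups the i-th entry of every vector together.
    return [sum(column) / n for column in zip(*vectors)]


# Toy vectors, invented for illustration only (not data from the paper).
real_hairpins = [[0.5, 0.3, 0.2], [0.7, 0.1, 0.2]]
pseudo_hairpins = [[0.2, 0.4, 0.4], [0.4, 0.2, 0.4]]

real_mean = class_mean_frequencies(real_hairpins)
pseudo_mean = class_mean_frequencies(pseudo_hairpins)

# Elements with a large difference between the class means are the ones
# that separate the two classes most clearly in a plot like Figure 2.
differences = [r - p for r, p in zip(real_mean, pseudo_mean)]
```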
