Evol Bioinform Online. 2013 Apr 14;9:163-84.
doi: 10.4137/EBO.S10580. Print 2013.

A Tool Preference Choice Method for RNA Secondary Structure Prediction by SVM with Statistical Tests

Chiou-Yi Hor et al. Evol Bioinform Online.

Abstract

The prediction of RNA secondary structures has drawn much attention from both biologists and computer scientists. Many useful tools have been developed for this purpose, each with its own strengths and weaknesses. We therefore propose a tool choice method, based on support vector machines (SVM), that integrates three prediction tools: pknotsRG, RNAStructure, and NUPACK. Our method first extracts features from the target RNA sequence and then applies two information-theoretic feature selection methods to rank the features. We also propose a method that combines feature selection and classifier fusion in an incremental manner. Our test data set contains 720 RNA sequences: 225 pseudoknotted RNA sequences obtained from PseudoBase and 495 nested RNA sequences obtained from RNA SSTRAND. The method serves as a preprocessing step for analyzing RNA sequences before the RNA secondary structure prediction tools are employed. In addition, the performance of the various configurations is subjected to statistical tests to examine whether the differences are significant. The best base-pair accuracy achieved is 75.5%, obtained by the proposed incremental method, which is significantly higher than the 68.8% achieved by the best single predictor, pknotsRG.

Keywords: RNA; feature selection; secondary structure; statistical test; support vector machine.
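
The tool choice idea summarized in the abstract can be illustrated with a minimal sketch. It assumes that a base-pair accuracy is already available for each tool on each training sequence; the accuracy values, the function names, and the `predicted` choices below are hypothetical and are not taken from the paper. The sketch only shows how per-sequence "best tool" labels (the targets a classifier such as an SVM would be trained on) could be derived, and why choosing a tool per sequence can exceed the best single tool. Feature extraction, feature selection, and the SVM itself are omitted.

```python
from statistics import mean

TOOLS = ("pknotsRG", "RNAStructure", "NUPACK")

# Hypothetical base-pair accuracies per sequence (illustrative numbers only).
accuracies = [
    {"pknotsRG": 0.80, "RNAStructure": 0.60, "NUPACK": 0.70},
    {"pknotsRG": 0.50, "RNAStructure": 0.90, "NUPACK": 0.60},
    {"pknotsRG": 0.70, "RNAStructure": 0.65, "NUPACK": 0.85},
    {"pknotsRG": 0.75, "RNAStructure": 0.55, "NUPACK": 0.55},
]


def best_tool_labels(rows):
    """Label each sequence with the tool that scored highest on it."""
    return [max(TOOLS, key=lambda tool: row[tool]) for row in rows]


def single_tool_accuracy(rows, tool):
    """Mean accuracy when the same tool is used for every sequence."""
    return mean(row[tool] for row in rows)


def chosen_accuracy(rows, choices):
    """Mean accuracy when a (possibly different) tool is chosen per sequence."""
    return mean(row[choice] for row, choice in zip(rows, choices))


# Training targets for a tool-choice classifier.
labels = best_tool_labels(accuracies)

# Baseline: the single tool with the highest mean accuracy.
best_single = max(TOOLS, key=lambda tool: single_tool_accuracy(accuracies, tool))

# Upper bound: a classifier that always picks the best tool.
oracle = chosen_accuracy(accuracies, labels)

# A hypothetical imperfect classifier that gets the third sequence wrong.
predicted = ["pknotsRG", "RNAStructure", "pknotsRG", "pknotsRG"]
imperfect = chosen_accuracy(accuracies, predicted)
```

In this toy setting even the imperfect chooser scores above the best single tool, which is the kind of gain the abstract reports; the actual size of the gain depends entirely on the data and the classifier.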

Figures

Figure 1
The nested (top) and pseudoknotted (bottom) bonded RNA structures.
