Stem kernels for RNA sequence analyses
- PMID: 17933013
- DOI: 10.1142/s0219720007003028
Abstract
Several computational methods based on stochastic context-free grammars have been developed for modeling and analyzing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNA and are used for structural alignment of RNA sequences. However, such stochastic models cannot sufficiently discriminate member sequences of an RNA family from nonmembers, and hence cannot reliably detect noncoding RNA regions in genome sequences. A novel kernel function, the stem kernel, is proposed for the discrimination and detection of functional RNA sequences using support vector machines (SVMs). The stem kernel is a natural extension of the string kernel, specifically the all-subsequences kernel, and is tailored to measure the similarity of two RNA sequences from the viewpoint of secondary structure. It examines all possible common base pairs and stem structures of arbitrary length between two RNA sequences, including pseudoknots, and calculates the inner product of common stem structure counts. An efficient dynamic programming algorithm is developed to calculate the stem kernel. The stem kernel is then applied to discriminate members of an RNA family from nonmembers using SVMs. The study indicates that the discrimination ability of the stem kernel is strong compared with conventional methods. Furthermore, the potential application of the stem kernel is demonstrated by the detection of remotely homologous RNA families in terms of secondary structure, an application motivated by the fact that the string kernel has been shown to work for remote homology detection of protein sequences. These experimental results motivate applying the stem kernel to find novel RNA families in genome sequences.
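The abstract describes the stem kernel as an extension of the all-subsequences kernel computed by dynamic programming. As background only, the following is a minimal sketch of the standard all-subsequences string kernel, not the stem kernel itself, whose recursion is not given in the abstract. The function names `all_subsequences_kernel` and `normalized_kernel` are hypothetical, and the cosine-style normalization is an assumption added for illustration rather than something stated in the source.

```python
import math


def all_subsequences_kernel(s, t):
    """Count pairs of (possibly non-contiguous) subsequences of s and t
    that spell the same string, including the empty subsequence.

    Standard all-subsequences kernel, shown as background only; this is
    NOT the stem kernel described in the abstract.
    """
    n, m = len(s), len(t)
    # K[i][j] is the kernel value for the prefixes s[:i] and t[:j].
    # Any prefix shares exactly the empty subsequence with the empty string.
    K = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        # running = sum of K[i-1][k-1] over positions k <= j with t[k-1] == s[i-1]
        running = 0
        for j in range(1, m + 1):
            if t[j - 1] == s[i - 1]:
                running += K[i - 1][j - 1]
            # Either s[i-1] is unused (K[i-1][j]) or it is matched to some t[k-1].
            K[i][j] = K[i - 1][j] + running
    return K[n][m]


def normalized_kernel(s, t):
    """Cosine-normalized kernel value (an illustrative assumption,
    not a step stated in the abstract)."""
    return all_subsequences_kernel(s, t) / math.sqrt(
        all_subsequences_kernel(s, s) * all_subsequences_kernel(t, t)
    )
```

A Gram matrix built from such a kernel over a set of sequences is the usual input to an SVM that accepts precomputed kernels. The sketch runs in time proportional to the product of the two sequence lengths.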