Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures
- PMID: 16448022
Abstract
Motivation: Because whole-genome sequences are now available for many species, computational prediction of RNA secondary structures and computational identification of non-coding RNA regions by comparative genomics have become important, and both require more advanced alignment methods. Recently, structural alignment of RNA sequences has been introduced to address these problems. By structural alignment, we mean a pairwise alignment of an unfolded RNA sequence to a folded RNA sequence of known secondary structure. Pair HMMs on tree structures (PHMMTSs), proposed by Sakakibara, are efficient automata-theoretic models for structural alignment of RNA secondary structures, but they cannot handle pseudoknots. Tree adjoining grammars (TAGs), on the other hand, are a subclass of context-sensitive grammars that is suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs so that they can handle pseudoknots.
Results: We propose pair stochastic tree adjoining grammars (PSTAGs) for modeling RNA secondary structures including pseudoknots, and present strong experimental evidence that modeling pseudoknot structures significantly improves the prediction accuracy of RNA secondary structures. First, we extend the notion of PHMMTSs defined on alignments of 'trees' to PSTAGs defined on alignments of 'TAG (derivation) trees', which represent a top-down parsing process of TAGs and are functionally equivalent to derived trees of TAGs. Second, we modify PSTAGs so that they take as input a pair consisting of a linear sequence and a TAG tree representing a pseudoknot structure of RNA, and produce a structural alignment. Then, we develop a polynomial-time algorithm for obtaining an optimal structural alignment by PSTAGs, based on a dynamic programming parser. We have performed several computational experiments on predicting pseudoknots by PSTAGs, and the results suggest that prediction of RNA pseudoknot structures by our method is more efficient and biologically plausible than by other conventional methods. The binary code for the PSTAG method is freely available from our website at http://www.dna.bio.keio.ac.jp/pstag/.
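To make concrete what separates a pseudoknot from an ordinary nested secondary structure, the following minimal sketch checks a set of base pairs for crossing pairs, the property that context-free (tree-shaped) models cannot represent. It is an illustration only and not the PSTAG algorithm: the extended dot-bracket input format and the function names `parse_pairs` and `has_pseudoknot` are assumptions of this sketch and do not come from the paper.

```python
from typing import Dict, List, Tuple

# Bracket families of an extended dot-bracket string (an assumption of this
# sketch; the abstract does not specify any input notation).
BRACKETS: Dict[str, str] = {"(": ")", "[": "]", "{": "}", "<": ">"}


def parse_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return base pairs (i, j), i < j, from an extended dot-bracket string."""
    closers = {close: open_ for open_, close in BRACKETS.items()}
    stacks: Dict[str, List[int]] = {open_: [] for open_ in BRACKETS}
    pairs: List[Tuple[int, int]] = []
    for pos, ch in enumerate(structure):
        if ch in BRACKETS:
            stacks[ch].append(pos)          # remember the opening position
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    for open_, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {open_!r} at position {stack[-1]}")
    return sorted(pairs)


def has_pseudoknot(pairs: List[Tuple[int, int]]) -> bool:
    """True if two base pairs cross, i.e. i < k < j < l for pairs (i, j), (k, l)."""
    ordered = sorted(pairs)
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:
                return True
    return False


if __name__ == "__main__":
    nested = "((..))"            # hairpin: pairs are properly nested
    knotted = "((..[[..))..]]"   # the [ ] pairs cross the ( ) pairs
    print(has_pseudoknot(parse_pairs(nested)))
    print(has_pseudoknot(parse_pairs(knotted)))
```

The quadratic pairwise check is chosen for readability rather than speed; it is adequate for single RNA sequences of typical length.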
Similar articles
- Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures. Bioinformatics. 2005 Jun 1;21(11):2611-7. doi: 10.1093/bioinformatics/bti385. Epub 2005 Mar 22. PMID: 15784748
- Pair hidden Markov models on tree structures. Bioinformatics. 2003;19 Suppl 1:i232-40. doi: 10.1093/bioinformatics/btg1032. PMID: 12855464
- RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment. Bioinformatics. 2007 Aug 1;23(15):1883-91. doi: 10.1093/bioinformatics/btm272. Epub 2007 May 30. PMID: 17537756
- Bridging the gap in RNA structure prediction. Curr Opin Struct Biol. 2007 Apr;17(2):157-65. doi: 10.1016/j.sbi.2007.03.001. Epub 2007 Mar 23. PMID: 17383172 Review.
- An overview of RNA structure prediction and applications to RNA gene prediction and RNAi design. Curr Protoc Bioinformatics. 2006 Mar;Chapter 12:Unit 12.1. doi: 10.1002/0471250953.bi1201s13. PMID: 18428758 Review.
Cited by
- Peptide vocabulary analysis reveals ultra-conservation and homonymity in protein sequences. Bioinform Biol Insights. 2009 Nov 24;1:101-26. doi: 10.4137/bbi.s415. PMID: 20066129 Free PMC article.
- Informatic resources for identifying and annotating structural RNA motifs. Mol Biotechnol. 2009 Feb;41(2):180-93. doi: 10.1007/s12033-008-9114-z. Epub 2008 Nov 1. PMID: 18979204 Free PMC article. Review.