Pair hidden Markov models on tree structures
- PMID: 12855464
- DOI: 10.1093/bioinformatics/btg1032
Abstract
Motivation: Computational identification of non-coding RNA regions in the genome leaves much scope for investigation and is essentially harder than gene finding for protein-coding regions. Because comparative sequence analysis is effective for non-coding RNA detection, efficient computational methods for structural alignment of RNA sequences are needed. Meanwhile, hidden Markov models (HMMs) have played important roles in modeling and analysing biological sequences. In particular, pair HMMs (PHMMs) have been examined extensively as mathematical models for alignment and gene finding.
Results: We propose pair HMMs on tree structures (PHMMTSs), an extension of PHMMs defined on alignments of trees. PHMMTSs provide a unifying framework and an automata-theoretic model for alignments of trees, structural alignments and pair stochastic context-free grammars. By structural alignment, we mean a pairwise alignment of an unfolded RNA sequence to an RNA sequence of known secondary structure. First, we extend the notion of PHMMs defined on alignments of 'linear' sequences to pair stochastic tree automata, called PHMMTSs, defined on alignments of 'trees'. PHMMTSs support various types of tree alignment, such as affine-gap alignments of trees. Second, based on the observation that a secondary structure of RNA can be represented by a tree, we apply PHMMTSs to the problem of structural alignment of RNAs. We modify PHMMTSs so that they take as input a pair consisting of a 'linear' sequence and a 'tree' representing an RNA secondary structure, and produce a structural alignment. Further, PHMMTSs that take a pair of linear sequences as input are mathematically equivalent to pair stochastic context-free grammars. We present computational experiments that show the effectiveness of our method for structural alignment, and discuss the complexity of PHMMTSs.
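The observation that an RNA secondary structure can be represented by a tree can be made concrete with a small sketch. The encoding below is an assumption chosen for illustration, not necessarily the one used in the paper: the structure is given in dot-bracket notation, each base pair becomes an internal node, and each unpaired base becomes a leaf. The names `Node`, `structure_to_tree` and `to_sexpr` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Label is "root", a base pair such as "G-C", or a single unpaired base.
    label: str
    children: List["Node"] = field(default_factory=list)


def structure_to_tree(seq: str, dotbracket: str) -> Node:
    """Build an ordered tree from a sequence and its dot-bracket structure.

    Illustrative encoding (an assumption): "(" opens a base-pair node,
    ")" closes it, and "." adds an unpaired base as a leaf.
    Pseudoknots cannot be expressed in this notation.
    """
    if len(seq) != len(dotbracket):
        raise ValueError("sequence and structure must have the same length")
    root = Node("root")
    stack = [root]  # path from the root to the currently open base pair
    for base, sym in zip(seq, dotbracket):
        if sym == "(":
            node = Node(base)  # label is completed when the pair closes
            stack[-1].children.append(node)
            stack.append(node)
        elif sym == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure: extra ')'")
            node = stack.pop()
            node.label = f"{node.label}-{base}"
        elif sym == ".":
            stack[-1].children.append(Node(base))
        else:
            raise ValueError(f"unexpected symbol {sym!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure: missing ')'")
    return root


def to_sexpr(node: Node) -> str:
    """Render the tree as a parenthesised string for inspection."""
    if not node.children:
        return node.label
    inner = " ".join(to_sexpr(child) for child in node.children)
    return f"({node.label} {inner})"


if __name__ == "__main__":
    # A small hairpin: two stacked G-C pairs closing a three-base loop.
    print(to_sexpr(structure_to_tree("GGAAACC", "((...))")))
```

In the structural alignment setting described above, a tree of this kind would stand for the RNA of known secondary structure, while the unfolded sequence stays linear.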