Combining phylogenetic and hidden Markov models in biosequence analysis
- PMID: 15285899
- DOI: 10.1089/1066527041410472
Abstract
A few models have appeared in recent years that consider not only the way substitutions occur through evolutionary history at each site of a genome, but also the way the process changes from one site to the next. These models combine phylogenetic models of molecular evolution, which apply to individual sites, and hidden Markov models, which allow for changes from site to site. Besides improving the realism of ordinary phylogenetic models, they are potentially very powerful tools for inference and prediction, for example in gene finding or the prediction of secondary structure. In this paper, we review progress on combined phylogenetic and hidden Markov models and present some extensions to previous work. Our main result is a simple and efficient method for accommodating higher-order states in the HMM, which allows for context-dependent models of substitution, that is, models that consider the effects of neighboring bases on the pattern of substitution. We present experimental results indicating that higher-order states, autocorrelated rates, and multiple functional categories all lead to significant improvements in the fit of a combined phylogenetic and hidden Markov model, with the effect of higher-order states being particularly pronounced.
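To make the combination concrete, here is a minimal sketch of a first-order combined model, not the method of the paper. It assumes a Jukes-Cantor substitution model, a made-up three-species tree, gap-free alignment columns, and two hidden states that differ only in a rate scale ("conserved" versus "neutral"). Every name, branch length, and probability below is hypothetical. Each hidden state emits an alignment column with the likelihood given by Felsenstein's pruning algorithm on the tree, and the forward algorithm sums over state paths along the alignment. The higher-order, context-dependent states described in the abstract are not covered.

```python
import math

BASES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor probability that base a is replaced by base b over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def partial(node, column, scale):
    """Felsenstein pruning: for each base x, P(leaves below node | node has base x)."""
    if isinstance(node, str):  # leaf, identified by species name
        return [1.0 if column[node] == x else 0.0 for x in BASES]
    result = [1.0] * len(BASES)
    for child, length in node:  # internal node: list of (child, branch length)
        below = partial(child, column, scale)
        for i, x in enumerate(BASES):
            result[i] *= sum(jc_prob(x, y, scale * length) * below[j]
                             for j, y in enumerate(BASES))
    return result

def column_likelihood(tree, column, scale):
    """Likelihood of one alignment column, with a uniform base distribution at the root."""
    return sum(0.25 * p for p in partial(tree, column, scale))

def forward_loglik(tree, columns, scales, start, trans):
    """Forward algorithm with per-column rescaling; returns the log-likelihood."""
    n = len(scales)
    loglik = 0.0
    f = None
    for col in columns:
        emit = [column_likelihood(tree, col, s) for s in scales]
        if f is None:
            f = [start[k] * emit[k] for k in range(n)]
        else:
            f = [emit[k] * sum(f[j] * trans[j][k] for j in range(n))
                 for k in range(n)]
        total = sum(f)
        loglik += math.log(total)
        f = [v / total for v in f]  # rescale to avoid underflow on long alignments
    return loglik

# Hypothetical tree ((human, mouse), chicken) with made-up branch lengths.
SPECIES = ("human", "mouse", "chicken")
TREE = [([("human", 0.1), ("mouse", 0.3)], 0.05), ("chicken", 0.5)]

# Toy alignment, one dict per column.
ROWS = ("ACGTTA", "ACGTCA", "ACGACG")
COLUMNS = [dict(zip(SPECIES, bases)) for bases in zip(*ROWS)]

SCALES = [0.3, 1.0]                  # state 0 "conserved", state 1 "neutral"
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]     # autocorrelation: states tend to persist

loglik = forward_loglik(TREE, COLUMNS, SCALES, START, TRANS)
```

With a single state the forward recursion reduces to the product of independent column likelihoods, which is the ordinary phylogenetic model; the transition matrix is what lets the evolutionary process change from one site to the next.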