Analysis of an optimal hidden Markov model for secondary structure prediction
- PMID: 17166267
- PMCID: PMC1769381
- DOI: 10.1186/1472-6807-6-25
Abstract
Background: Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but neural networks are not intuitively interpretable. In contrast, hidden Markov models are interpretable graphical models, and they have been used successfully in many bioinformatics applications. Because they rest on a sound statistical foundation and allow model interpretation, we propose a method based on hidden Markov models.
Results: Our HMM is designed without prior knowledge. It is chosen from a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model alpha-helices, 12 that model coil and 9 that model beta-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We first analyze the model features and show how the model offers a new view of local structures. We then use it for secondary structure prediction. The model performs well on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED prediction on single sequences. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%.
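To make the Q3 figures above concrete, the following is a minimal sketch of how a three-class accuracy can be computed, and of one common way to turn an HMM with several hidden states per class into a per-residue prediction (forward-backward posteriors summed over the states of each class). The abstract does not specify the decoding procedure, so this choice is an assumption. The function names, the three-state toy model and the three-letter alphabet are hypothetical illustrations, not the 36-state model described here.

```python
from typing import Dict, List

def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose class (H, E or C) is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

def posterior_class_decode(obs: str,
                           start: List[float],
                           trans: List[List[float]],
                           emit: List[Dict[str, float]],
                           state_class: List[str]) -> str:
    """Label each residue with the class of highest summed posterior probability."""
    n, T = len(start), len(obs)

    # Forward pass, rescaled at every position to avoid numerical underflow.
    prev = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = sum(prev)
    prev = [x / total for x in prev]
    fwd = [prev]
    for t in range(1, T):
        cur = [emit[j][obs[t]] * sum(prev[i] * trans[i][j] for i in range(n))
               for j in range(n)]
        total = sum(cur)
        prev = [x / total for x in cur]
        fwd.append(prev)

    # Backward pass, also rescaled (the scale does not affect the argmax).
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        nxt = [sum(trans[i][j] * emit[j][obs[t + 1]] * bwd[t + 1][j]
                   for j in range(n)) for i in range(n)]
        total = sum(nxt)
        bwd[t] = [x / total for x in nxt]

    # Sum state posteriors within each class and keep the best class.
    labels = []
    for t in range(T):
        class_mass: Dict[str, float] = {}
        for i in range(n):
            c = state_class[i]
            class_mass[c] = class_mass.get(c, 0.0) + fwd[t][i] * bwd[t][i]
        labels.append(max(class_mass, key=class_mass.get))
    return "".join(labels)

# Hypothetical toy model: one state per class, three-letter alphabet.
STATE_CLASS = ["H", "E", "C"]
START = [1 / 3, 1 / 3, 1 / 3]
TRANS = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
EMIT = [{"A": 0.8, "V": 0.1, "G": 0.1},
        {"A": 0.1, "V": 0.8, "G": 0.1},
        {"A": 0.1, "V": 0.1, "G": 0.8}]

prediction = posterior_class_decode("AAAAVVVVGGGG", START, TRANS, EMIT, STATE_CLASS)
print(prediction, q3_score(prediction, "HHHHEEEECCCC"))
```

In a model with many states per class, `state_class` would simply map several state indices to the same letter; the summation step is what collapses them into a three-class prediction.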
Conclusion: The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.
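The generative use mentioned in the conclusion can be sketched as ancestral sampling from an HMM: draw a start state, emit a residue, then move to the next state according to the transition probabilities. The code below is an illustrative sketch only. The function name `sample_hmm`, the three-state toy parameters and the three-letter alphabet are assumptions made for this example and do not reproduce the published model.

```python
import random
from typing import Dict, List, Tuple

def sample_hmm(length: int,
               start: List[float],
               trans: List[List[float]],
               emit: List[Dict[str, float]],
               state_class: List[str],
               rng: random.Random) -> Tuple[str, str]:
    """Sample a residue sequence and its class labels from an HMM."""
    states = list(range(len(start)))
    residues: List[str] = []
    labels: List[str] = []
    # Draw the initial hidden state from the start distribution.
    state = rng.choices(states, weights=start, k=1)[0]
    for _ in range(length):
        # Emit one residue from the current state's emission distribution.
        symbols = list(emit[state])
        weights = [emit[state][s] for s in symbols]
        residues.append(rng.choices(symbols, weights=weights, k=1)[0])
        labels.append(state_class[state])
        # Move to the next hidden state.
        state = rng.choices(states, weights=trans[state], k=1)[0]
    return "".join(residues), "".join(labels)

# Hypothetical toy model: one state per class, three-letter alphabet.
STATE_CLASS = ["H", "E", "C"]
START = [1 / 3, 1 / 3, 1 / 3]
TRANS = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
EMIT = [{"A": 0.8, "V": 0.1, "G": 0.1},
        {"A": 0.1, "V": 0.8, "G": 0.1},
        {"A": 0.1, "V": 0.1, "G": 0.8}]

sequence, structure = sample_hmm(30, START, TRANS, EMIT, STATE_CLASS,
                                 random.Random(0))
print(sequence)
print(structure)
```

In this sketch the secondary structure content of the sampled sequences is governed by the transition matrix: raising the self-transition probability of a state lengthens the segments of its class on average.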