Improvement of protein secondary structure prediction using binary word encoding
- PMID: 9037710
- DOI: 10.1002/(sici)1097-0134(199701)27:1<36::aid-prot5>3.0.co;2-l
Abstract
We propose a binary word encoding to improve protein secondary structure prediction. A binary word encoding maps a local amino acid sequence to a binary word, a string of 0s and 1s, by applying an encoding function that maps each amino acid to 0 or 1. Using the binary word encoding, we can statistically extract multiresidue information, that is, information that depends on more than one residue. We combine the binary word encoding with the GOR method, with a modified version of GOR that shows better accuracy, and with the neural network method. The binary word encoding improves the accuracy of GOR by 2.8%. We obtain similar improvements when we combine it with the modified GOR method and the neural network method. When we use multiple sequence alignment data, the binary word encoding similarly improves the accuracy. The accuracy of our best combined method is 68.2%. Because this paper shows improvement only for the GOR and neural network methods, we cannot say that the encoding improves other methods. However, the improvement obtained with the encoding suggests that multiresidue interactions affect the formation of secondary structure. In addition, we find that the optimal encoding function obtained by simulated annealing relates to nonpolarity. This means that nonpolarity is important to the multiresidue interaction.
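To make the encoding step concrete, the following is a minimal Python sketch, not the authors' implementation. The names `NONPOLAR`, `encode_residue`, `binary_word`, `binary_words`, and `word_state_counts` are hypothetical, the window width of 5 is an arbitrary choice, and the encoding function (1 for residues commonly classed as nonpolar, 0 otherwise) is only an illustrative stand-in for the optimized function the abstract says was found by simulated annealing.

```python
from collections import Counter
from typing import List

# Assumption: an illustrative encoding function, NOT the paper's optimized one.
# Residues commonly classed as nonpolar map to 1, all others to 0.
NONPOLAR = frozenset("AVLIMFWPG")


def encode_residue(aa: str) -> int:
    """Map a one-letter amino acid code to 0 or 1."""
    return 1 if aa.upper() in NONPOLAR else 0


def binary_word(window: str) -> str:
    """Encode a local amino acid sequence as a binary word such as '11001'."""
    return "".join(str(encode_residue(aa)) for aa in window)


def binary_words(sequence: str, width: int = 5) -> List[str]:
    """Return one binary word per full window of `width` residues."""
    return [
        binary_word(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    ]


def word_state_counts(sequence: str, states: str, width: int = 5) -> Counter:
    """Count (binary word, state of the central residue) pairs.

    `states` holds one secondary structure label per residue, for example
    'H', 'E', or 'C'. Counts like these are the raw material for any
    statistic that depends on more than one residue at a time.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    half = width // 2
    counts = Counter()
    for i in range(len(sequence) - width + 1):
        word = binary_word(sequence[i:i + width])
        counts[(word, states[i + half])] += 1
    return counts
```

For example, `binary_word("ALKEV")` gives `"11001"` under this stand-in encoding. How the word counts are then combined with GOR or a neural network is not specified in the abstract, so the sketch stops at counting.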