Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure
- PMID: 16106377
- DOI: 10.1002/prot.20630
Abstract
This study develops a neural network-based method for predicting the real value of solvent accessibility from sequence, using evolutionary information in the form of multiple sequence alignments. Two feed-forward networks with a single hidden layer were trained with standard back-propagation as the learning algorithm. When a multiple sequence alignment obtained from PSI-BLAST is used as input instead of a single sequence, Pearson's correlation coefficient increases from 0.53 to 0.63 and the mean absolute error (MAE) decreases from 18.2% to 16%. Incorporating secondary structure information predicted by PSIPRED further improves the correlation coefficient from 0.63 to 0.67. The final network yields an MAE of 15.2% between experimental and predicted values when tested on two different nonhomologous and nonredundant datasets of varying sizes. The method consists of two steps: (1) a sequence-to-structure network is trained on multiple alignment profiles in the form of PSI-BLAST-generated position-specific scoring matrices, and (2) the output of the first network, together with PSIPRED-predicted secondary structure information, is used as input to a second, structure-to-structure network. Based on this study, a server, SARpred (http://www.imtech.res.in/raghava/sarpred/), has been developed that predicts the real value of solvent accessibility of the residues in a given protein sequence. We also evaluated SARpred on 47 proteins used in CASP6 and achieved a correlation coefficient of 0.68 and an MAE of 15.9% between predicted and observed values.
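The two-step design and the two reported metrics can be illustrated with a minimal Python sketch. It is not the SARpred implementation: the window size, hidden-layer size, sigmoid activations, and random untrained weights are all assumptions, the input arrays are random stand-ins for PSI-BLAST profiles and PSIPRED states, and back-propagation training is omitted.

```python
import numpy as np

def sliding_windows(features, window):
    """Concatenate each residue's features with its neighbours (zero-padded ends)."""
    half = window // 2
    padded = np.pad(features, ((half, half), (0, 0)))
    return np.stack([padded[i:i + window].ravel() for i in range(len(features))])

def feed_forward(x, w_hidden, w_out):
    """Single-hidden-layer network; sigmoid units are an assumption."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return sigmoid(sigmoid(x @ w_hidden) @ w_out)

def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two 1-D arrays."""
    o = observed - observed.mean()
    p = predicted - predicted.mean()
    return float((o * p).sum() / np.sqrt((o ** 2).sum() * (p ** 2).sum()))

def mean_absolute_error(observed, predicted):
    """Mean absolute difference, in the same units as the inputs."""
    return float(np.abs(observed - predicted).mean())

rng = np.random.default_rng(0)
n_res, window, hidden = 50, 7, 10          # hypothetical sizes

# Random stand-ins for real inputs (assumptions, not actual data).
pssm = rng.normal(size=(n_res, 20))        # one 20-column profile row per residue
ss = np.eye(3)[rng.integers(0, 3, n_res)]  # one-hot helix/strand/coil states

# Step 1: sequence-to-structure network on windowed profile rows.
x1 = sliding_windows(pssm, window)
w1_hidden = rng.normal(scale=0.1, size=(x1.shape[1], hidden))
w1_out = rng.normal(scale=0.1, size=(hidden, 1))
stage1 = feed_forward(x1, w1_hidden, w1_out)

# Step 2: structure-to-structure network on step-1 output plus secondary structure.
x2 = sliding_windows(np.hstack([stage1, ss]), window)
w2_hidden = rng.normal(scale=0.1, size=(x2.shape[1], hidden))
w2_out = rng.normal(scale=0.1, size=(hidden, 1))
stage2 = feed_forward(x2, w2_hidden, w2_out).ravel() * 100.0  # scale to percent

# Synthetic "observed" accessibilities, only to exercise the metrics.
observed = rng.uniform(0.0, 100.0, size=n_res)
print("Pearson r:", pearson_r(observed, stage2))
print("MAE (%):", mean_absolute_error(observed, stage2))
```

Because the weights are untrained and the inputs are random, the printed values carry no meaning. The sketch only shows how the output of the first network is combined with secondary structure states as input to the second, and how a correlation coefficient and an MAE in percentage points would be computed from per-residue predictions.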
Copyright 2005 Wiley-Liss, Inc.
Similar articles
- A neural network method for prediction of beta-turn types in proteins using evolutionary information. Bioinformatics. 2004 Nov 1;20(16):2751-8. doi: 10.1093/bioinformatics/bth322. Epub 2004 May 14. PMID: 15145798
- Role of evolutionary information in prediction of aromatic-backbone NH interactions in proteins. FEBS Lett. 2004 Apr 23;564(1-2):47-57. doi: 10.1016/S0014-5793(04)00305-9. PMID: 15094041
- Prediction of alpha-turns in proteins using PSI-BLAST profiles and secondary structure information. Proteins. 2004 Apr 1;55(1):83-90. doi: 10.1002/prot.10569. PMID: 14997542
- BiRDS - Binding Residue Detection from Protein Sequences Using Deep ResNets. J Chem Inf Model. 2022 Apr 25;62(8):1809-1818. doi: 10.1021/acs.jcim.1c00972. Epub 2022 Apr 12. PMID: 35414182 Review.
- Protein secondary structure prediction. Curr Opin Struct Biol. 1995 Jun;5(3):372-6. doi: 10.1016/0959-440x(95)80099-9. PMID: 7583635 Review.
Cited by
- Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network. Biomolecules. 2018 May 25;8(2):33. doi: 10.3390/biom8020033. PMID: 29799510 Free PMC article.
- Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci Rep. 2015 Jun 22;5:11476. doi: 10.1038/srep11476. PMID: 26098304 Free PMC article.
- Epitope Mapping of Rhi o 1 and Generation of a Hypoallergenic Variant: A CANDIDATE MOLECULE FOR FUNGAL ALLERGY VACCINES. J Biol Chem. 2016 Aug 19;291(34):18016-29. doi: 10.1074/jbc.M116.732032. Epub 2016 Jun 28. PMID: 27358405 Free PMC article.
- In silico platform for prediction of N-, O- and C-glycosites in eukaryotic protein sequences. PLoS One. 2013 Jun 28;8(6):e67008. doi: 10.1371/journal.pone.0067008. Print 2013. PMID: 23840574 Free PMC article.
- Identification of NAD interacting residues in proteins. BMC Bioinformatics. 2010 Mar 30;11:160. doi: 10.1186/1471-2105-11-160. PMID: 20353553 Free PMC article.