Weighted-support vector machines for predicting membrane protein types based on pseudo-amino acid composition
- PMID: 15314209
- DOI: 10.1093/protein/gzh061
Abstract
Membrane proteins are generally classified into the following five types: (1) type I membrane proteins, (2) type II membrane proteins, (3) multipass transmembrane proteins, (4) lipid chain-anchored membrane proteins and (5) GPI-anchored membrane proteins. Prediction of membrane protein types has become an increasingly active topic in bioinformatics. Two critical challenges currently face this area: first, how to take into account the extremely complicated sequence-order effects, and second, how to deal with the highly uneven sizes of the subsets in a training dataset. In this paper, stimulated by the concept of using the pseudo-amino acid composition to incorporate the sequence-order effects, the spectral analysis technique is introduced to represent the statistical sample of a protein. Based on such a framework, the weighted support vector machine (SVM) algorithm is applied. The new approach has remarkable power in dealing with the bias that arises when one subset in the training dataset contains many more samples than the others. The new method is particularly useful when the focus is on proteins belonging to small subsets. The results obtained by the self-consistency test, jackknife test and independent dataset test are encouraging, indicating that the current approach may serve as a powerful complementary tool to other existing methods for predicting the types of membrane proteins.
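The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the spectral-analysis representation or the weighting scheme, so the code below substitutes a simplified pseudo-amino acid composition built from a single property (the Kyte-Doolittle hydropathy scale, an assumed choice) and a common inverse-frequency heuristic for per-class weights. The function names, the parameters `lam` and `weight`, and the toy sequence and labels are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed property; the paper may use others).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}

# Standardize the property to zero mean and unit standard deviation
# over the 20 amino acids.
_values = list(HYDROPATHY.values())
_mu, _sigma = mean(_values), pstdev(_values)
NORMALIZED = {aa: (v - _mu) / _sigma for aa, v in HYDROPATHY.items()}


def pseudo_aac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional simplified pseudo-amino acid composition.

    The first 20 components are amino acid frequencies; the last `lam`
    components are sequence-order correlation factors for gaps 1..lam.
    """
    seq = [aa for aa in sequence.upper() if aa in NORMALIZED]
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        diffs = [
            (NORMALIZED[seq[i]] - NORMALIZED[seq[i + k]]) ** 2
            for i in range(len(seq) - k)
        ]
        thetas.append(mean(diffs))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Small subsets get larger weights, so their errors cost more in training.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


if __name__ == "__main__":
    features = pseudo_aac("MKTLLLTLVVVTIVCLDLGYT", lam=3)  # toy sequence
    weights = inverse_frequency_weights(["multipass"] * 8 + ["type I"] * 2)
    print(len(features), weights)
```

In a weighted SVM, weights of this kind would typically scale the misclassification penalty per class, so that errors on a small subset such as the toy "type I" class above (weight 2.5 versus 0.625 for "multipass") are penalized more heavily than errors on the dominant subset.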