Prediction of the functional class of metal-binding proteins from sequence derived physicochemical properties by support vector machine approach
- PMID: 17254297
- PMCID: PMC1764469
- DOI: 10.1186/1471-2105-7-S5-S13
Abstract
Metal-binding proteins play important roles in structural stability, signaling, regulation, transport, immune response, metabolism control, and metal homeostasis. Because these proteins are diverse in both function and sequence, it is desirable to explore additional methods that predict metal-binding proteins irrespective of sequence similarity. This work explores support vector machines (SVM) as such a method. SVM prediction systems were trained on 53,333 metal-binding and 147,347 non-metal-binding proteins and evaluated on an independent set of 31,448 metal-binding and 79,051 non-metal-binding proteins. The computed prediction accuracy for member proteins is 86.3%, 81.6%, 83.5%, 94.0%, 81.2%, 85.4%, 77.6%, 90.4%, 90.9%, 74.9% and 78.1% for calcium-binding, cobalt-binding, copper-binding, iron-binding, magnesium-binding, manganese-binding, nickel-binding, potassium-binding, sodium-binding, zinc-binding, and all metal-binding proteins, respectively. The accuracy for the non-member proteins of each class, in the same order, is 88.2%, 99.9%, 98.1%, 91.4%, 87.9%, 94.5%, 99.2%, 99.9%, 99.9%, 98.0%, and 88.0%. Comparable accuracies were obtained with a different SVM kernel function. Our method correctly predicts as metal-binding 67% of the 87 metal-binding proteins that are non-homologous to any protein in the Swiss-Prot database, and 85.3% of the 333 proteins with known metal-binding domains. These results suggest that SVM is useful for facilitating the prediction of metal-binding proteins. Our software can be accessed at the SVMProt server http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
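The abstract reports accuracy separately for member and non-member proteins, so a single combined figure has to be derived. The sketch below shows one way to do that for the "all metal-binding" classifier. It assumes that the 78.1% and 88.0% figures were measured on the independent evaluation set of 31,448 and 79,051 proteins; the function names are illustrative and are not taken from the paper or from SVMProt.

```python
# Sketch only: combines the two accuracies quoted in the abstract.
# Assumption: both percentages refer to the independent evaluation set.

def overall_accuracy(n_pos, acc_pos, n_neg, acc_neg):
    """Fraction of all proteins classified correctly (size-weighted)."""
    correct = n_pos * acc_pos + n_neg * acc_neg
    return correct / (n_pos + n_neg)

def balanced_accuracy(acc_pos, acc_neg):
    """Unweighted mean of member and non-member accuracy."""
    return (acc_pos + acc_neg) / 2

# Values quoted in the abstract for the all-metal-binding class.
N_METAL = 31_448        # metal-binding proteins in the evaluation set
N_NON_METAL = 79_051    # non-metal-binding proteins in the evaluation set
ACC_METAL = 0.781       # accuracy on member proteins
ACC_NON_METAL = 0.880   # accuracy on non-member proteins

overall = overall_accuracy(N_METAL, ACC_METAL, N_NON_METAL, ACC_NON_METAL)
balanced = balanced_accuracy(ACC_METAL, ACC_NON_METAL)

print(f"overall accuracy:  {overall:.3f}")
print(f"balanced accuracy: {balanced:.3f}")
```

Because non-member proteins outnumber members by roughly 2.5 to 1 in the evaluation set, the size-weighted figure sits closer to the non-member accuracy, while the balanced figure treats both groups equally.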