A novel and efficient technique for identification and classification of GPCRs
- PMID: 18632334
- DOI: 10.1109/TITB.2007.911308
Abstract
G-protein coupled receptors (GPCRs) play a vital role in biological processes such as the regulation of cell growth, death, and metabolism. GPCRs are the focus of a significant amount of current pharmaceutical research because they interact with more than 50% of prescription drugs. The dipeptide-based support vector machine (SVM) approach is the most accurate technique for identifying and classifying GPCRs. However, this approach has two major disadvantages. First, the dipeptide-based feature vector has 400 dimensions, which makes the classification task inefficient in both computation and memory. Second, it does not consider the biological properties of the protein sequence when identifying and classifying GPCRs. In this paper, we present an SVM classification technique based on novel features. The features are derived by applying a wavelet-based time series analysis approach to protein sequences. The proposed feature space summarizes the variance information of seven important biological properties of the amino acids in a protein sequence. In addition, the feature vector of the proposed technique has only 35 dimensions. Experiments were performed on GPCR protein sequences available at the GPCRs Database. Our approach achieves accuracies of 99.9%, 98.06%, 97.78%, and 94.08% for the GPCR superfamily, families, subfamilies, and subsubfamilies (amine group), respectively, when evaluated using fivefold cross-validation. Further, accuracies of 99.8%, 97.26%, and 97.84% were obtained on unseen (recall) datasets of the GPCR superfamily, families, and subfamilies, respectively. Comparison with the dipeptide-based SVM technique shows the effectiveness of our approach.
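The contrast between the two feature spaces in the abstract can be illustrated with a minimal Python sketch. The dipeptide part follows the standard definition (frequencies of the 20 x 20 ordered residue pairs). The wavelet part is only a guess at the general idea: the abstract names neither the seven properties, nor the wavelet, nor how the 35 dimensions arise, so the sketch assumes a Haar wavelet, five decomposition levels (so that 7 properties x 5 levels = 35), and the Kyte-Doolittle hydropathy index as a single stand-in property. All function names are hypothetical.

```python
from itertools import product
from statistics import pvariance

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, used as ONE stand-in property.
# Assumption: the abstract does not list the seven properties it uses.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Position of each ordered dipeptide in the 400-dimensional vector.
DIPEPTIDE_INDEX = {
    a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))
}


def dipeptide_composition(seq):
    """Return the 400-dimensional vector of dipeptide frequencies."""
    counts = [0] * len(DIPEPTIDE_INDEX)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in DIPEPTIDE_INDEX:  # skip pairs with non-standard residues
            counts[DIPEPTIDE_INDEX[pair]] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


def haar_detail_variances(signal, levels=5):
    """Variance of the Haar detail coefficients at each level.

    Assumption: Haar wavelet and 5 levels are illustrative choices.
    """
    approx = list(signal)
    variances = []
    for _ in range(levels):
        if len(approx) < 2:  # signal too short for another level
            variances.append(0.0)
            continue
        starts = range(0, len(approx) - 1, 2)  # drops a trailing odd sample
        detail = [(approx[i] - approx[i + 1]) / 2 ** 0.5 for i in starts]
        approx = [(approx[i] + approx[i + 1]) / 2 ** 0.5 for i in starts]
        variances.append(pvariance(detail))
    return variances


def wavelet_variance_features(seq, property_scales, levels=5):
    """Concatenate per-level variances for each property scale."""
    features = []
    for scale in property_scales:
        signal = [scale[res] for res in seq if res in scale]
        features.extend(haar_detail_variances(signal, levels))
    return features
```

With seven property scales and five levels, `wavelet_variance_features` would return 35 values per sequence regardless of sequence length, compared with the 400 values returned by `dipeptide_composition`; either vector could then be passed to an SVM classifier.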
Similar articles
- Classification of G-protein coupled receptors at four levels. Protein Eng Des Sel. 2006 Nov;19(11):511-6. doi: 10.1093/protein/gzl038. Epub 2006 Oct 10. PMID: 17032692
- GPCR-MPredictor: multi-level prediction of G protein-coupled receptors using genetic ensemble. Amino Acids. 2012 May;42(5):1809-23. doi: 10.1007/s00726-011-0902-6. Epub 2011 Apr 20. PMID: 21505826
- Prediction of G-protein-coupled receptor classes based on the concept of Chou's pseudo amino acid composition: an approach from discrete wavelet transform. Anal Biochem. 2009 Jul 1;390(1):68-73. doi: 10.1016/j.ab.2009.04.009. Epub 2009 Apr 11. PMID: 19364489
- Proteomic applications of automated GPCR classification. Proteomics. 2007 Aug;7(16):2800-14. doi: 10.1002/pmic.200700093. PMID: 17639603. Review.
- Support vector machine applications in bioinformatics. Appl Bioinformatics. 2003;2(2):67-77. PMID: 15130823. Review.
Cited by
- An improved classification of G-protein-coupled receptors using sequence-derived features. BMC Bioinformatics. 2010 Aug 9;11:420. doi: 10.1186/1471-2105-11-420. PMID: 20696050. Free PMC article.
- The repertoire of G protein-coupled receptors in the human parasite Schistosoma mansoni and the model organism Schmidtea mediterranea. BMC Genomics. 2011 Dec 6;12:596. doi: 10.1186/1471-2164-12-596. PMID: 22145649. Free PMC article.
- Classification of G-protein coupled receptors based on support vector machine with maximum relevance minimum redundancy and genetic algorithm. BMC Bioinformatics. 2010 Jun 16;11:325. doi: 10.1186/1471-2105-11-325. PMID: 20550715. Free PMC article.