ScientificWorldJournal. 2014:2014:236717.
doi: 10.1155/2014/236717. Epub 2014 Jun 15.

An empirical study of different approaches for protein classification

Loris Nanni et al. ScientificWorldJournal. 2014.

Abstract

Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position-specific scoring matrix (PSSM) of the proteins, those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of protein descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors perform well across all tested datasets, in some cases better than the state of the art.
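The sum-rule fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that one SVM per descriptor has already been trained and has produced a per-class score for every test protein; SVM training itself is omitted. The descriptor names, the score values, and the z-score normalization applied before summing are assumptions made for illustration and are not taken from the abstract.

```python
from statistics import mean, pstdev
from typing import Dict, List


def zscore(matrix: List[List[float]]) -> List[List[float]]:
    """Rescale all scores of one classifier to zero mean and unit variance.

    Assumption: normalizing before fusion keeps a classifier with a
    larger score range from dominating the sum.
    """
    flat = [s for row in matrix for s in row]
    mu, sigma = mean(flat), pstdev(flat)
    if sigma == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[(s - mu) / sigma for s in row] for row in matrix]


def sum_rule(scores_by_descriptor: Dict[str, List[List[float]]]) -> List[int]:
    """Fuse classifiers by summing their normalized per-class scores.

    Each value is a matrix with one row per sample and one column per class.
    Returns the index of the class with the highest fused score per sample.
    """
    normalized = [zscore(m) for m in scores_by_descriptor.values()]
    n_samples = len(normalized[0])
    n_classes = len(normalized[0][0])
    predictions = []
    for i in range(n_samples):
        fused = [sum(m[i][c] for m in normalized) for c in range(n_classes)]
        predictions.append(max(range(n_classes), key=fused.__getitem__))
    return predictions


# Hypothetical SVM scores for 3 proteins and 2 classes (illustrative values).
scores = {
    "pssm_descriptor": [[0.9, 0.1], [0.45, 0.55], [0.2, 0.8]],
    "sequence_descriptor": [[0.7, 0.3], [0.7, 0.3], [0.1, 0.9]],
}
predicted = sum_rule(scores)
print(predicted)  # [0, 0, 1]
```

For the second protein the two hypothetical classifiers disagree, and the fused score follows the more confident one. This is the kind of complementarity that fusion is meant to exploit.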

Figures

Figure 1. Schema of the proposed method.
Figure 2. DM images extracted from 2 sample proteins of the DNA dataset.

