Predicting thermophilic proteins with pseudo amino acid composition: approached from chaos game representation and principal component analysis
- PMID: 21787282
- DOI: 10.2174/092986611797642661
Abstract
Comprehensive knowledge of the thermophilic mechanisms of organisms whose optimum growth temperature (OGT) ranges from 50 to 80 °C plays a major role in the design of stable proteins. Predicting whether a protein of unknown function is thermophilic is a long-standing problem that has not been fully resolved. Chaos game representation (CGR) can reveal hidden patterns in protein sequences and can also visualize their previously unknown structures. In this paper, using the general form of pseudo amino acid composition to represent protein samples, we propose a novel method that maps a protein sequence to a CGR picture using the CGR algorithm. A 24-dimensional vector extracted from the CGR segments and the first two principal component analysis (PCA) features are used to classify thermophilic and mesophilic proteins with a Support Vector Machine (SVM). The method is evaluated by the jackknife test. With the 24-dimensional vector alone, the accuracy is 0.8792 and the Matthews Correlation Coefficient (MCC) is 0.7587. The 26-dimensional vector obtained by adding the two PCA components performs highly satisfactorily, reaching an accuracy of 0.9944 and an MCC of 0.9888. The results show the effectiveness of the new hybrid method.
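The abstract does not specify how residues are mapped onto the CGR picture, so the following is only a minimal sketch of the generic chaos game idea applied to a protein sequence. It assumes, purely for illustration, that the 20 standard amino acids sit on the vertices of a regular 20-gon and that each step moves halfway toward the vertex of the current residue. The names `AMINO_ACIDS`, `VERTICES`, `cgr_points`, and the `ratio` parameter are hypothetical and not taken from the paper.

```python
import math

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one vertex per residue type on a regular 20-gon inscribed
# in the unit circle. The paper may use a different layout.
VERTICES = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def cgr_points(sequence, ratio=0.5):
    """Return the chaos game trajectory of a protein sequence.

    Starting from the origin, each residue moves the current point a
    fraction `ratio` of the way toward that residue's vertex.
    Characters that are not standard amino acids are skipped.
    """
    x, y = 0.0, 0.0
    points = []
    for residue in sequence.upper():
        if residue not in VERTICES:
            continue  # skip ambiguous or unknown residues such as 'X'
        vx, vy = VERTICES[residue]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points
```

For example, `cgr_points("AA")` returns `[(0.5, 0.0), (0.75, 0.0)]`, because alanine is placed at the vertex (1, 0) in this layout. Features such as the 24-dimensional vector described above would then be computed from the resulting point set. That step is not shown here because the abstract does not define it.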
Similar articles
- A new hybrid fractal algorithm for predicting thermophilic nucleotide sequences. J Theor Biol. 2012 Jan 21;293:74-81. doi: 10.1016/j.jtbi.2011.09.028. Epub 2011 Oct 10. PMID: 22001320
- Predicting protein solubility by the general form of Chou's pseudo amino acid composition: approached from chaos game representation and fractal dimension. Protein Pept Lett. 2012 Sep;19(9):940-8. doi: 10.2174/092986612802084492. PMID: 22486614
- Prediction of thermophilic protein with pseudo amino Acid composition: an approach from combined feature selection and reduction. Protein Pept Lett. 2011 Jul;18(7):684-9. doi: 10.2174/092986611795446085. PMID: 21413920
- Differences in amino acids composition and coupling patterns between mesophilic and thermophilic proteins. Amino Acids. 2008 Jan;34(1):25-33. doi: 10.1007/s00726-007-0589-x. Epub 2007 Aug 21. PMID: 17710363 Review.
- Chaos game representation and its applications in bioinformatics. Comput Struct Biotechnol J. 2021 Nov 10;19:6263-6271. doi: 10.1016/j.csbj.2021.11.008. eCollection 2021. PMID: 34900136 Free PMC article. Review.
Cited by
- A Method for Prediction of Thermophilic Protein Based on Reduced Amino Acids and Mixed Features. Front Bioeng Biotechnol. 2020 May 5;8:285. doi: 10.3389/fbioe.2020.00285. eCollection 2020. PMID: 32432088 Free PMC article.
- Accurate prediction of nuclear receptors with conjoint triad feature. BMC Bioinformatics. 2015 Dec 3;16:402. doi: 10.1186/s12859-015-0828-1. PMID: 26630876 Free PMC article.
- iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy. Front Microbiol. 2022 Feb 22;13:790063. doi: 10.3389/fmicb.2022.790063. eCollection 2022. PMID: 35273581 Free PMC article.
- Prediction of RNA-protein interactions using conjoint triad feature and chaos game representation. Bioengineered. 2018;9(1):242-251. doi: 10.1080/21655979.2018.1470721. PMID: 30117758 Free PMC article.