Recent progresses in the application of machine learning approach for predicting protein functional class independent of sequence similarity
- PMID: 16791826
- DOI: 10.1002/pmic.200500938
Abstract
A protein's sequence contains clues to its function. Predicting function from sequence is challenging, particularly for proteins that have low or no sequence similarity to proteins of known function. Recently, machine learning methods have been explored for predicting the functional class of proteins from sequence-derived properties, independent of sequence similarity, and have shown promising potential for low- and non-homologous proteins. These methods can thus be explored as potential tools to complement alignment- and clustering-based methods for predicting protein function. This article reviews the strategies, current progress, and underlying difficulties in using machine learning methods for predicting the functional class of proteins. The relevant software and web servers are described. The reported prediction performances of these methods are also presented; they need to be interpreted with caution, as they depend on factors such as the datasets used and the choice of parameters.
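To make the idea of predicting a functional class from sequence-derived properties without any alignment more concrete, the following is a minimal sketch and not the method of any work covered by the review. It assumes amino acid composition as the only descriptor and uses a nearest-centroid rule as a simple stand-in for the machine learning methods the abstract refers to. The function names, the toy sequences, and the labels `class_A` and `class_B` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition (fractions) of seq.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def train_centroids(labelled):
    """Average the composition vectors of each class (hypothetical helper)."""
    sums, sizes = {}, {}
    for seq, label in labelled:
        vec = composition(seq)
        if label not in sums:
            sums[label] = [0.0] * len(AMINO_ACIDS)
            sizes[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        sizes[label] += 1
    return {label: [s / sizes[label] for s in sums[label]] for label in sums}


def predict(seq, centroids):
    """Assign seq to the class whose centroid is nearest in Euclidean distance."""
    vec = composition(seq)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy, made-up sequences: not real proteins and not data from the review.
training = [
    ("LLVVAAIILLFF", "class_A"),
    ("AVILLFVAIL", "class_A"),
    ("KKEEDDRRKE", "class_B"),
    ("DEKRDEKRKK", "class_B"),
]
centroids = train_centroids(training)
print(predict("LIVAFLLV", centroids))
print(predict("KDERKK", centroids))
```

No sequence alignment is performed at any point: each protein is reduced to a fixed-length vector, so a query with no detectable homolog can still be classified. As the abstract cautions for reported performances, any accuracy obtained from such a setup would depend on the dataset and the choice of descriptors and parameters.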