Prediction of protein structural class for the twilight zone sequences
- PMID: 17433260
- DOI: 10.1016/j.bbrc.2007.03.164
Abstract
Structural class characterizes the overall folding type of a protein or its domain. This paper develops an accurate method for in silico prediction of structural classes from low-homology (twilight zone) protein sequences. The proposed LLSC-PRED method applies a linear logistic regression classifier and a custom-designed, feature-based sequence representation to provide predictions. The main advantages of LLSC-PRED are its comprehensive representation, which includes 58 features describing the composition and physicochemical properties of the sequences, and the transparency of its prediction model. The representation also includes predicted secondary structure content, thus exploring for the first time the synergy between these two related predictions. In tests on a large set of 1673 twilight zone domains, LLSC-PRED achieves a prediction accuracy of over 62%, which is better than the accuracy of over a dozen recently published competing in silico methods and similar to the accuracy of other, non-transparent classifiers that use the proposed representation.
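The general idea of a feature-based representation scored by a linear logistic model can be illustrated with a minimal Python sketch. This is not the LLSC-PRED implementation: it uses only the 20 amino acid composition fractions rather than the 58 features of the paper, the four class labels are an assumption (the abstract does not list them), and the names `composition` and `predict_proba` as well as the zero-valued placeholder weights are hypothetical.

```python
import math

# The 20 standard amino acids, one composition feature per residue type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed class labels; the abstract does not enumerate them.
CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


def composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def softmax(scores):
    """Convert linear scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def predict_proba(features, weights, biases):
    """Multinomial logistic model: one linear score per class, then softmax."""
    scores = [
        b + sum(w * x for w, x in zip(row, features))
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Placeholder (untrained) parameters, for illustration only.
weights = [[0.0] * len(AMINO_ACIDS) for _ in CLASSES]
biases = [0.0] * len(CLASSES)

features = composition("MKTAYIAKQR")  # hypothetical example sequence
for label, p in zip(CLASSES, predict_proba(features, weights, biases)):
    print(label, round(p, 3))
```

Because each class score is a plain weighted sum of the features, the fitted weights of such a model can be read directly, which is one sense in which a linear logistic classifier is transparent.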