PSIST: indexing protein structures using suffix trees
- PMID: 16447979
- DOI: 10.1109/csb.2005.46
Abstract
Approaches for indexing proteins and for fast, scalable searching for structures similar to a query structure have important applications, such as protein structure and function prediction, protein classification, and drug discovery. In this paper, we develop a new method for extracting the local feature vectors of protein structures. Each residue is represented by a triangle, and the correlation between a set of residues is described by the distances between Cα atoms and the angles between the normals of the planes in which the triangles lie. The normalized local feature vectors are indexed using a suffix tree. For all query segments, the suffix tree can be used to retrieve the maximal matches efficiently; these matches are then chained to obtain alignments with database proteins. Similar proteins are selected by their alignment score against the query. Our results show classification accuracy of up to 97.8% and 99.4% at the superfamily and class levels, respectively, according to the SCOP classification, and show that, on average, 7.49 of the top 10 matches are proteins from the same superfamily as the query. These results are competitive with the best previous methods.
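The local feature described in the abstract (a Cα-Cα distance plus the angle between the normals of two residue triangles) can be illustrated with a minimal Python sketch. The abstract does not say which three points define each residue's triangle, so the use of the backbone N, Cα, and C atoms here is an assumption, as are the names `unit_normal` and `pair_features`. Normalization of the vectors and the suffix tree indexing are not shown.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit_normal(p, q, r):
    """Unit normal of the plane through the triangle (p, q, r)."""
    n = cross(sub(q, p), sub(r, p))
    length = math.sqrt(dot(n, n))
    return tuple(x / length for x in n)

def pair_features(res_i, res_j):
    """Return (distance, angle) for two residues.

    Each residue is a dict with 3D coordinates under the keys
    "N", "CA" and "C" (an assumed choice of triangle vertices).
    The distance is between the two CA atoms; the angle, in radians,
    is between the normals of the two residue triangles.
    """
    distance = math.dist(res_i["CA"], res_j["CA"])
    n_i = unit_normal(res_i["N"], res_i["CA"], res_i["C"])
    n_j = unit_normal(res_j["N"], res_j["CA"], res_j["C"])
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot(n_i, n_j)))
    return distance, math.acos(cos_angle)
```

Applying `pair_features` to residue pairs at fixed sequence offsets would yield one feature vector per position, which is the kind of sequence a suffix tree can index once the values are discretized.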