Towards index-based similarity search for protein structure databases
PMID: 16452789
Abstract
We propose two methods for finding similarities in protein structure databases. Both extract feature vectors from triplets of secondary structure elements (SSEs) of proteins and index these vectors with a multidimensional index structure. The first method addresses the problem of finding proteins in a dataset that are similar to a given query protein: it uses the index to quickly identify promising candidates, which are then aligned to the query with a popular pairwise alignment tool such as VAST. We also develop a novel statistical model that estimates the goodness of a match from the SSEs. The second method addresses the problem of joining two protein datasets to find all-to-all similarities. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while maintaining similar sensitivity.
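The filtering idea described in the abstract (feature vectors on SSE triplets, an index lookup, then a shortlist of candidates for pairwise alignment) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each SSE is modeled as a line segment given by its start and end coordinates, the feature vector consists of the three sorted pairwise midpoint distances and the three sorted pairwise inter-axis angles, and a simple hash grid stands in for the multidimensional index structure. The names `triplet_features`, `TripletIndex`, and the cell widths are hypothetical. The statistical scoring model and the final VAST alignment are not modeled.

```python
import math
from collections import defaultdict
from itertools import combinations

# Assumption: an SSE is a segment ((x1, y1, z1), (x2, y2, z2)) with nonzero length.

def midpoint(sse):
    start, end = sse
    return tuple((s + e) / 2 for s, e in zip(start, end))

def direction(sse):
    start, end = sse
    return tuple(e - s for s, e in zip(start, end))

def angle(u, v):
    """Angle between two vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def triplet_features(a, b, c):
    """Hypothetical 6-D feature vector for a triplet of SSEs.

    Distances and angles are sorted separately so the vector does not
    depend on the order in which the three SSEs are listed.
    """
    pairs = [(a, b), (a, c), (b, c)]
    dists = sorted(math.dist(midpoint(p), midpoint(q)) for p, q in pairs)
    angles = sorted(angle(direction(p), direction(q)) for p, q in pairs)
    return tuple(dists + angles)

class TripletIndex:
    """Hash grid used as a stand-in for a multidimensional index."""

    def __init__(self, dist_cell=2.0, angle_cell=30.0):
        # Illustrative cell widths: distance units for the first three
        # dimensions, degrees for the last three.
        self.widths = (dist_cell,) * 3 + (angle_cell,) * 3
        self.cells = defaultdict(set)

    def _cell(self, features):
        return tuple(int(f // w) for f, w in zip(features, self.widths))

    def add(self, protein_id, sses):
        for a, b, c in combinations(sses, 3):
            self.cells[self._cell(triplet_features(a, b, c))].add(protein_id)

    def candidates(self, sses):
        """Rank database proteins by how many query triplets hit their cells."""
        votes = defaultdict(int)
        for a, b, c in combinations(sses, 3):
            cell = self._cell(triplet_features(a, b, c))
            for pid in self.cells.get(cell, ()):
                votes[pid] += 1
        return sorted(votes, key=lambda p: (-votes[p], p))

# Toy data: protein "A" has mostly perpendicular SSEs, "B" has parallel ones.
protein_a = [((0, 0, 0), (6, 0, 0)), ((0, 5, 0), (0, 5, 6)),
             ((4, 0, 8), (4, 6, 8)), ((10, 10, 0), (10, 10, 7))]
protein_b = [((0, 0, 0), (0, 0, 20)), ((30, 0, 0), (30, 0, 20)),
             ((0, 40, 0), (0, 40, 20))]

index = TripletIndex()
index.add("A", protein_a)
index.add("B", protein_b)

# The query is protein A shifted in space; the features are translation invariant.
shift = (10, -5, 3)
query = [tuple(tuple(x + d for x, d in zip(point, shift)) for point in sse)
         for sse in protein_a]
shortlist = index.candidates(query)
print(shortlist)
```

In a pipeline of this shape, only the proteins in `shortlist` would be passed on to the expensive pairwise alignment step. A plain grid lookup misses near matches that fall just across a cell boundary; a real multidimensional index with range queries would avoid that limitation.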