A fast indexing approach for protein structure comparison
- PMID: 20122220
- PMCID: PMC3724480
- DOI: 10.1186/1471-2105-11-S1-S46
Abstract
Background: Protein structure comparison is a fundamental task in structural biology. While the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures is still relatively slow using existing methods. There is a need for new techniques which can rapidly compare protein structures, whilst maintaining high matching accuracy.
Results: We have developed IR Tableau, a fast protein comparison algorithm that uses the tableau representation to compare protein tertiary structures. IR Tableau compares tableaux using information retrieval style feature indexing techniques. Experimental analysis on the ASTRAL SCOP protein structural domain database demonstrates that IR Tableau achieves a speedup of two orders of magnitude over the search times of existing methods, while producing search results of comparable accuracy.
Conclusion: We show that very significant speedups can be obtained for the protein structure comparison problem by employing an information retrieval style approach for indexing proteins. The comparison accuracy achieved is also strong, opening the way for large-scale processing of very large protein structure databases.
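The abstract does not spell out how the feature indexing works, so the following is only a minimal sketch of the general information retrieval idea, not the authors' implementation. It assumes each tableau has already been reduced to a list of two-letter orientation codes between pairs of secondary structure elements, treats those codes as a bag of features, stores them in an inverted index, and ranks database entries by cosine similarity. The function names, the example codes, and the choice of cosine similarity are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def to_vector(codes):
    """Bag-of-features vector: how often each orientation code occurs.

    Assumption: a tableau is supplied as a flat list of code strings.
    """
    return Counter(codes)


def build_index(database):
    """Build an inverted index (feature -> {protein id: count}) and vector norms.

    `database` maps a hypothetical protein/domain id to its list of codes.
    """
    index = defaultdict(dict)
    norms = {}
    for pid, codes in database.items():
        vec = to_vector(codes)
        norms[pid] = math.sqrt(sum(c * c for c in vec.values()))
        for feature, count in vec.items():
            index[feature][pid] = count
    return index, norms


def search(query_codes, index, norms):
    """Rank indexed entries by cosine similarity to the query.

    Only entries sharing at least one feature with the query are visited,
    which is where an inverted index saves work over pairwise comparison.
    """
    qvec = to_vector(query_codes)
    qnorm = math.sqrt(sum(c * c for c in qvec.values()))
    dots = defaultdict(float)
    for feature, qcount in qvec.items():
        for pid, count in index.get(feature, {}).items():
            dots[pid] += qcount * count
    scores = {pid: dot / (qnorm * norms[pid]) for pid, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data with made-up ids and codes, for illustration only.
toy_db = {
    "domA": ["PE", "PE", "OT"],
    "domB": ["PE", "OT", "OT"],
    "domC": ["RT", "LS"],
}
toy_index, toy_norms = build_index(toy_db)
toy_results = search(["PE", "PE", "OT"], toy_index, toy_norms)
```

In this toy run, `domC` shares no feature with the query and is never scored, which illustrates why an index lookup can be much cheaper than aligning the query against every structure in the database.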
Similar articles
- Fast and accurate protein substructure searching with simulated annealing and GPUs. BMC Bioinformatics. 2010 Sep 3;11:446. doi: 10.1186/1471-2105-11-446. PMID: 20813068. Free PMC article.
- Contact patterns between helices and strands of sheet define protein folding patterns. Proteins. 2007 Mar 1;66(4):869-76. doi: 10.1002/prot.21241. PMID: 17206659.
- Tableau-based protein substructure search using quadratic programming. BMC Bioinformatics. 2009 May 19;10:153. doi: 10.1186/1471-2105-10-153. PMID: 19450287. Free PMC article.
- Cataloging topologies of protein folding patterns. J Mol Recognit. 2010 Mar-Apr;23(2):253-7. doi: 10.1002/jmr.1006. PMID: 20151416. Review.
- Rapid retrieval of protein structures from databases. Drug Discov Today. 2007 Sep;12(17-18):732-9. doi: 10.1016/j.drudis.2007.07.014. Epub 2007 Aug 28. PMID: 17826686. Review.
Cited by
- Fast and accurate protein substructure searching with simulated annealing and GPUs. BMC Bioinformatics. 2010 Sep 3;11:446. doi: 10.1186/1471-2105-11-446. PMID: 20813068. Free PMC article.
- Multiple graph regularized protein domain ranking. BMC Bioinformatics. 2012 Nov 19;13:307. doi: 10.1186/1471-2105-13-307. PMID: 23157331. Free PMC article.
- ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval. BMC Bioinformatics. 2012 May 8;13 Suppl 7(Suppl 7):S2. doi: 10.1186/1471-2105-13-S7-S2. PMID: 22594999. Free PMC article.
- RUPEE: A fast and accurate purely geometric protein structure search. PLoS One. 2019 Mar 15;14(3):e0213712. doi: 10.1371/journal.pone.0213712. eCollection 2019. PMID: 30875409. Free PMC article.