Code optimization of the subroutine to remove near identical matches in the sequence database homology search tool PSI-BLAST
- PMID: 20583927
- DOI: 10.1089/cmb.2008.0053
Abstract
A central task in protein sequence characterization is the use of a sequence database homology search tool to find similar protein sequences in other individuals or species. PSI-BLAST is a widely used module of the BLAST package that calculates a position-specific score matrix from the best-matching sequences and performs iterated searches, using a subroutine that removes near-identical matches so that many similar sequences do not all contribute to the scores. For some queries and parameter settings, PSI-BLAST may find many similar high-scoring matches, in which case up to 80% of the total run time may be spent in this removal procedure. In this article, we present code optimizations that improve the cache utilization and the overall performance of this procedure. Measurements show that, for queries where the number of similar matches is high, the optimized PSI-BLAST program may be as much as 2.9 times faster than the original program.
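To make the cost of this step concrete, the following is a minimal Python sketch of what removing near-identical matches from a set of aligned sequences could look like. It is not the PSI-BLAST source and not the optimized code from the article. The function names, the gap character, the 0.9 identity threshold, and the toy sequences are all assumptions chosen for illustration. The sketch compares each candidate row against every row kept so far, which suggests why the run time can grow quickly when many matches are similar.

```python
from typing import List


def fraction_identical(a: str, b: str, gap: str = "-") -> float:
    """Fraction of equal residues among columns where neither row has a gap."""
    compared = 0
    same = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns that are gapped in either row
        compared += 1
        if x == y:
            same += 1
    return same / compared if compared else 0.0


def purge_near_identical(rows: List[str], threshold: float = 0.9) -> List[int]:
    """Return indices of rows kept after dropping near-duplicates.

    A row is dropped when its identity to any earlier kept row reaches
    the threshold. The threshold value is an illustrative assumption.
    """
    kept: List[int] = []
    for i, row in enumerate(rows):
        # Pairwise scan against every kept row: quadratic in the worst case.
        if all(fraction_identical(rows[k], row) < threshold for k in kept):
            kept.append(i)
    return kept


# Hypothetical toy alignment; row 0 plays the role of the query.
rows = [
    "MKTAYIAKQR",
    "MKTAYIAKQR",  # identical to row 0
    "MKTAYIAKQK",  # 9 of 10 columns match row 0
    "MSTPYLAKHR",  # 6 of 10 columns match row 0
    "MK-AYIAKQR",  # 9 of 9 ungapped columns match row 0
]
kept_indices = purge_near_identical(rows, threshold=0.9)
print(kept_indices)  # prints [0, 3]
```

Each comparison in the inner loop walks two full rows, so memory access patterns over the alignment matter as much as the comparison count; the article's optimizations target that cache behavior, which this sketch does not attempt to reproduce.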