SIMAP: the similarity matrix of proteins
- PMID: 16381858
- PMCID: PMC1347468
- DOI: 10.1093/nar/gkj106
Abstract
Similarity Matrix of Proteins (SIMAP) (http://mips.gsf.de/simap) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith-Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.
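To make the idea of a pre-computed, incrementally updated similarity matrix concrete, the following is a minimal sketch, not SIMAP's actual implementation. It scores each pair with a plain Smith-Waterman local alignment and stores only the hits in a sparse dictionary. The function names (`smith_waterman_score`, `similarity_matrix`, `add_sequence`), the match/mismatch/gap values and the `min_score` cutoff are all assumptions made for illustration; a production system would use a substitution matrix, affine gap penalties and a FASTA-style heuristic prefilter, none of which are shown here.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (illustrative linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity_matrix(seqs, min_score=1):
    """Sparse all-against-all matrix as {(id_a, id_b): score} with id_a < id_b."""
    hits = {}
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(seqs.items()), 2):
        score = smith_waterman_score(seq_a, seq_b)
        if score >= min_score:
            hits[(id_a, id_b)] = score
    return hits


def add_sequence(hits, seqs, new_id, new_seq, min_score=1):
    """Incremental update: compare only the new sequence against the stored ones."""
    for old_id, old_seq in seqs.items():
        score = smith_waterman_score(new_seq, old_seq)
        if score >= min_score:
            hits[tuple(sorted((new_id, old_id)))] = score
    seqs[new_id] = new_seq


# Hypothetical toy sequences, not taken from the SIMAP dataset.
proteins = {"p1": "MKTAYIAK", "p2": "KTAYI"}
matrix = similarity_matrix(proteins)
add_sequence(matrix, proteins, "p3", "AYIAKQ")
```

The point of the incremental step is that adding one sequence to a set of n costs n new comparisons rather than recomputing all pairs, which is what makes keeping such a matrix up to date feasible for large datasets.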