The predictive power of the CluSTr database

Robert Petryszak¹, Ernst Kretschmann, Daniela Wieser, Rolf Apweiler

PMID: 15961444
DOI: 10.1093/bioinformatics/bti542

Bioinformatics. 2005 Sep 15;21(18):3604-9. Epub 2005 Jun 16.

Affiliation

¹ EMBL Outstation Hinxton, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

Abstract

Summary: The CluSTr database employs a fully automatic single-linkage hierarchical clustering method based on a similarity matrix. In order to compute the matrix, first all-against-all pair-wise comparisons between protein sequences are computed using the Smith-Waterman algorithm. The statistical significance of the similarity scores is then assessed using a Monte Carlo analysis, yielding Z-values, which are used to populate the matrix. This paper describes automated annotation experiments that quantify the predictive power and hence the biological relevance of the CluSTr data. The experiments utilized the UniProt data-mining framework to derive annotation predictions using combinations of InterPro and CluSTr. We show that this combination of data sources greatly increases the precision of predictions made by the data-mining framework, compared with the use of InterPro data alone. We conclude that the CluSTr approach to clustering proteins makes a valuable contribution to traditional protein classifications.
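The pipeline summarized above (Smith-Waterman scores, Monte Carlo Z-values, single-linkage clustering) can be illustrated with a minimal Python sketch. This is not the CluSTr implementation: the match/mismatch/gap scores, the shuffling scheme, the number of shuffles, and the function names `sw_score`, `z_value`, and `single_linkage` are all assumptions chosen for illustration. Single-linkage clusters at one Z threshold are computed here as connected components of the graph whose edges are pairs with Z at or above that threshold; repeating this at several thresholds gives nested levels of a hierarchy.

```python
import random
from itertools import combinations


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are toy assumptions, not CluSTr's parameters.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def z_value(a, b, shuffles=100, rng=None):
    """Monte Carlo Z-value: real score versus scores against shuffled copies of b."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    real = sw_score(a, b)
    scores = []
    for _ in range(shuffles):
        letters = list(b)
        rng.shuffle(letters)
        scores.append(sw_score(a, "".join(letters)))
    mean = sum(scores) / len(scores)
    std = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
    return (real - mean) / std if std > 0 else 0.0


def single_linkage(names, z, threshold):
    """Single-linkage clusters at one threshold, via union-find.

    `z` maps each pair (u, v), in the order produced by
    combinations(names, 2), to its Z-value.
    """
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in combinations(names, 2):
        if z[(u, v)] >= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())
```

To use it, fill the matrix with `z = {(u, v): z_value(seqs[u], seqs[v]) for u, v in combinations(names, 2)}` for a hypothetical dictionary `seqs` of sequences keyed by name, then call `single_linkage(names, z, threshold)` at each threshold of interest. Lowering the threshold can only merge clusters, never split them, which is why the levels nest.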

Availability: http://www.ebi.ac.uk/clustr/.

Cited by

Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores.
Bastien O, Maréchal E. BMC Bioinformatics. 2008 Aug 7;9:332. doi: 10.1186/1471-2105-9-332. PMID: 18687111. Free PMC article.
GeMMA: functional subfamily classification within superfamilies of predicted protein structural domains.
Lee DA, Rentzsch R, Orengo C. Nucleic Acids Res. 2010 Jan;38(3):720-37. doi: 10.1093/nar/gkp1049. Epub 2009 Nov 18. PMID: 19923231. Free PMC article.
New developments in the InterPro database.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. Nucleic Acids Res. 2007 Jan;35(Database issue):D224-8. doi: 10.1093/nar/gkl841. PMID: 17202162. Free PMC article.
Ortholog identification in the presence of domain architecture rearrangement.
Sjölander K, Datta RS, Shen Y, Shoffner GM. Brief Bioinform. 2011 Sep;12(5):413-22. doi: 10.1093/bib/bbr036. Epub 2011 Jun 28. PMID: 21712343. Free PMC article. Review.
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
Birkholtz LM, Bastien O, Wells G, Grando D, Joubert F, Kasam V, Zimmermann M, Ortet P, Jacq N, Saïdani N, Roy S, Hofmann-Apitius M, Breton V, Louw AI, Maréchal E. Malar J. 2006 Nov 17;5:110. doi: 10.1186/1475-2875-5-110. PMID: 17112376. Free PMC article. Review.

Grants and funding

1 U01 HG02712-01/HG/NHGRI NIH HHS/United States
