CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database
- PMID: 20696711
- DOI: 10.1093/glycob/cwq106
Abstract
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite the rich and invaluable information stored in the database, few software tools use it to annotate newly sequenced genomes with CAZy families. We have employed two annotation approaches to fill the gap between the manually curated, high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome and metagenome sequencing projects. The first approach is based on a similarity search against the entire set of nonredundant sequences in the CAZy database. The second approach performs annotation using links, or correspondences, between CAZy families and protein family domains. These links were discovered by applying an association rule learning algorithm to sequences from the CAZy database. The two approaches complement each other and, in combination, achieved high specificity and sensitivity when cross-evaluated against the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. We also demonstrate the capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
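The second approach, which links protein family domains to CAZy families through association rule learning, can be illustrated with a minimal sketch. This is not the CAT implementation: the function names (`mine_rules`, `annotate`), the thresholds, and the toy records with placeholder domain and family labels (`DOM_A`, `GH_X`, and so on) are all assumptions made for illustration. The sketch only shows the general idea of keeping "domain implies family" rules whose support and confidence pass a cutoff, then applying them to a new protein.

```python
from collections import Counter

def mine_rules(proteins, min_support=2, min_confidence=0.8):
    """Mine single-domain rules of the form (domain -> family).

    proteins: iterable of (set_of_domains, set_of_families) pairs.
    Returns {(domain, family): (support, confidence)}, where support is the
    number of proteins carrying both labels and confidence is
    support / (number of proteins carrying the domain).
    """
    domain_counts = Counter()
    pair_counts = Counter()
    for domains, families in proteins:
        for d in domains:
            domain_counts[d] += 1
            for f in families:
                pair_counts[(d, f)] += 1

    rules = {}
    for (d, f), support in pair_counts.items():
        confidence = support / domain_counts[d]
        if support >= min_support and confidence >= min_confidence:
            rules[(d, f)] = (support, confidence)
    return rules

def annotate(domains, rules):
    """Predict families for a protein from its domains using mined rules."""
    return sorted({f for (d, f) in rules if d in domains})

# Hypothetical training records (placeholders, not real CAZy data).
TOY_PROTEINS = [
    ({"DOM_A"}, {"GH_X"}),
    ({"DOM_A", "DOM_B"}, {"GH_X", "CBM_Y"}),
    ({"DOM_A"}, {"GH_X"}),
    ({"DOM_B"}, {"CBM_Y"}),
    ({"DOM_C"}, {"GH_X"}),
]

rules = mine_rules(TOY_PROTEINS)
prediction = annotate({"DOM_B", "DOM_Z"}, rules)
print(rules)
print(prediction)
```

In this toy set, the rule from `DOM_C` is dropped because it is seen only once, which is the kind of filtering a support threshold is meant to provide. A real pipeline would likely also consider multi-domain rules and would combine these predictions with the similarity-search results described in the abstract.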