The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis
- PMID: 15608188
- PMCID: PMC539978
- DOI: 10.1093/nar/gki024
Abstract
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43,229 domains classified into 1,467 superfamilies and 5,107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616,470 domain sequences classified into 23,876 sequence families. The expansion significantly enlarges the CATH HMM library, which now includes models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous Superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/), containing specific sequence, structural and functional information for each superfamily in CATH, considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH-associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) provides automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of each match is assessed, and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned.
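To put the counts quoted in the abstract in proportion, the short Python sketch below derives a few ratios from them, such as the average number of structural domains per superfamily and the fold expansion from structural domains to domain sequences. The variable names and the `ratio` helper are illustrative assumptions, not part of any CATH tooling, and the averages say nothing about how unevenly domains are distributed across families.

```python
# Counts quoted in the abstract; the names are illustrative, not CATH identifiers.
STRUCTURAL_DOMAINS = 43_229          # domains with known structure in CATH
SUPERFAMILIES = 1_467                # homologous superfamilies
STRUCTURAL_SEQ_FAMILIES = 5_107      # sequence families among structural domains
EXTENDED_DOMAIN_SEQUENCES = 616_470  # domain sequences after adding sequence relatives
EXTENDED_SEQ_FAMILIES = 23_876       # sequence families in the extended database


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting non-positive denominators."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


summary = {
    "structural domains per superfamily": ratio(STRUCTURAL_DOMAINS, SUPERFAMILIES),
    "fold expansion in domain sequences": ratio(EXTENDED_DOMAIN_SEQUENCES, STRUCTURAL_DOMAINS),
    "fold expansion in sequence families": ratio(EXTENDED_SEQ_FAMILIES, STRUCTURAL_SEQ_FAMILIES),
    "domain sequences per extended sequence family": ratio(
        EXTENDED_DOMAIN_SEQUENCES, EXTENDED_SEQ_FAMILIES
    ),
}

for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

These are simple means over the published totals; they are meant only as a rough sense of scale for the extended database relative to the structure-based core.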
Similar articles
- Identification and distribution of protein families in 120 completed genomes using Gene3D. Proteins. 2005 May 15;59(3):603-15. doi: 10.1002/prot.20409. PMID: 15768405
- A rapid classification protocol for the CATH Domain Database to support structural genomics. Nucleic Acids Res. 2001 Jan 1;29(1):223-7. doi: 10.1093/nar/29.1.223. PMID: 11125098 Free PMC article.
- The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution. Nucleic Acids Res. 2007 Jan;35(Database issue):D291-7. doi: 10.1093/nar/gkl959. Epub 2006 Nov 29. PMID: 17135200 Free PMC article.
- The history of the CATH structural classification of protein domains. Biochimie. 2015 Dec;119:209-17. doi: 10.1016/j.biochi.2015.08.004. Epub 2015 Aug 4. PMID: 26253692 Free PMC article. Review.
- Protein function annotation using protein domain family resources. Methods. 2016 Jan 15;93:24-34. doi: 10.1016/j.ymeth.2015.09.029. Epub 2015 Oct 3. PMID: 26434392 Review.
Cited by
- Comparison of molecular dynamics and superfamily spaces of protein domain deformation. BMC Struct Biol. 2009 Feb 17;9:6. doi: 10.1186/1472-6807-9-6. PMID: 19220918 Free PMC article.
- An automatic method for assessing structural importance of amino acid positions. BMC Struct Biol. 2009 Mar 4;9:10. doi: 10.1186/1472-6807-9-10. PMID: 19261183 Free PMC article.
- The Vein Patterning 1 (VEP1) gene family laterally spread through an ecological network. PLoS One. 2011;6(7):e22279. doi: 10.1371/journal.pone.0022279. Epub 2011 Jul 26. PMID: 21818306 Free PMC article.
- Docking protein domains in contact space. BMC Bioinformatics. 2006 Jun 21;7:310. doi: 10.1186/1471-2105-7-310. PMID: 16790041 Free PMC article.
- ProCKSI: a decision support system for Protein (structure) Comparison, Knowledge, Similarity and Information. BMC Bioinformatics. 2007 Oct 26;8:416. doi: 10.1186/1471-2105-8-416. PMID: 17963510 Free PMC article.