Hum Genomics. 2010 Feb;4(3):207-12.
doi: 10.1186/1479-7364-4-3-207.

The CATH database

Michael Knudsen et al. Hum Genomics. 2010 Feb.

Abstract

The CATH database provides a hierarchical classification of protein domains based on their folding patterns. Domains are obtained from protein structures deposited in the Protein Data Bank, and both domain identification and subsequent classification use manual as well as automated procedures. The accompanying website (www.cathdb.info) provides an easy-to-use entry point to the classification, allowing for both browsing and downloading of data. Here, we give a brief review of the database, its corresponding website and some related tools.

Figures

Figure 1
Screenshot of the domain 3cx5B01 in the CATH browser. The domain is classified as 3.30.830.10, which means that it belongs to the Mixed Alpha-Beta class (C = 3), the 2-Layer Sandwich architecture (A = 30) and so forth. The CATH Code column allows for easy browsing both up and down levels in the hierarchy, and the Links column provides links to relevant entries in the Gene3D database. An XML file containing all information on the page can be downloaded by clicking on the XML link next to the domain name. The icon below the image links to a structure file in the Rasmol format. The panes at the bottom of the screen provide additional information about the domain. The content of the Structure pane, which contains secondary structure information, is shown in the figure. The Sequence pane contains the amino acid sequence of the domain, and the History pane contains the history of the domain in CATH, with information about when the domain was added to the database and whether its classification has changed over time.
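The dotted CATH code described in this caption can be split into its hierarchy levels with a few lines of Python. This is only a minimal sketch: the function names `parse_cath_code` and `lineage` are hypothetical helpers introduced for illustration and are not part of any CATH tool or of the website.

```python
from typing import Dict, List

# The four levels that give CATH its name, in order.
LEVELS = ("class", "architecture", "topology", "homologous_superfamily")


def parse_cath_code(code: str) -> Dict[str, int]:
    """Split a dotted CATH code such as '3.30.830.10' into named levels.

    Shorter codes (for example the architecture '1.50') are accepted and
    simply yield fewer levels.
    """
    parts = code.split(".")
    if not 1 <= len(parts) <= len(LEVELS):
        raise ValueError(f"expected 1 to {len(LEVELS)} levels, got {code!r}")
    return {level: int(part) for level, part in zip(LEVELS, parts)}


def lineage(code: str) -> List[str]:
    """List the code of every ancestor node, from class down to the node itself."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    print(parse_cath_code("3.30.830.10"))
    print(lineage("3.30.830.10"))
```

The `lineage` helper mirrors the idea of browsing up the hierarchy from a domain's classification: each prefix of the code names one ancestor node.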
Figure 2
View of the Alpha/alpha barrel architecture (CATH classification 1.50) on the CATH website. The Classification Lineage shows where the selected architecture is placed in the CATH hierarchy, and the Summary of Child Nodes gives the number of nodes further down. The selected architecture comprises two topologies, 1.50.10 and 1.50.30, shown in the Summary pane below. Direct links to the CATH pages corresponding to the topologies, as well as links to representative domains, are available alongside the topology names. Clicking a link to a representative domain produces a page like the one in Figure 1. Navigation on all levels of the CATH hierarchy is facilitated by an analogous page layout.
Figure 3
Procedure for chopping protein chains into domains. From the input, domain boundaries are predicted using various algorithms such as ProteinDBS and CATHEDRAL. If the methods agree to a certain extent, or if the putative domains are matched by domains already in CATH, the domains are automatically determined. Otherwise, manual inspection is needed. This is a simplified version of the complete flow chart from Greene et al. [15].
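The decision described in this caption can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the caption does not say how agreement between methods is measured, so the sketch treats boundaries as residue positions and invents a `tolerance` parameter, and the names `boundaries_agree` and `chopping_route` are hypothetical. It is not the procedure used by CATH.

```python
from typing import Dict, Tuple

# Assumed representation: one tuple of boundary residue positions per method.
Boundaries = Tuple[int, ...]


def boundaries_agree(a: Boundaries, b: Boundaries, tolerance: int) -> bool:
    """Assumed agreement rule: same number of boundaries, each within `tolerance` residues."""
    return len(a) == len(b) and all(
        abs(x - y) <= tolerance for x, y in zip(sorted(a), sorted(b))
    )


def chopping_route(
    predictions: Dict[str, Boundaries],
    matches_existing_cath_domain: bool,
    tolerance: int = 10,  # hypothetical threshold, not taken from the paper
) -> str:
    """Return 'automatic' or 'manual' following the simplified flow in the caption."""
    methods = list(predictions.values())
    # Require at least two methods before speaking of agreement.
    consensus = len(methods) >= 2 and all(
        boundaries_agree(methods[0], other, tolerance) for other in methods[1:]
    )
    if consensus or matches_existing_cath_domain:
        return "automatic"
    return "manual"


if __name__ == "__main__":
    # Invented boundary positions, for illustration only.
    print(chopping_route({"ProteinDBS": (120,), "CATHEDRAL": (125,)}, False))
    print(chopping_route({"ProteinDBS": (120,), "CATHEDRAL": (180,)}, False))
```

The two branches correspond to the two routes in the caption: agreement between methods or a match to an existing CATH domain leads to automatic chopping, and anything else falls through to manual inspection.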
Figure 4
The distribution of topology sizes in the most recent version of CATH (version 3.2.0) resembles a power law. A few topologies, so-called superfolds, contain a disproportionate number of structures. The largest topology, the Rossmann fold (3.40.50), comprises 14,720 structures, whereas III topologies have one member only.
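A distribution like the one in this caption can be tabulated from a list of domain classifications by grouping on the first three levels of each CATH code. The sketch below assumes the input is simply a list of dotted four-level codes; the function names and the toy input are illustrative only and do not reproduce the counts reported for CATH 3.2.0.

```python
from collections import Counter
from typing import Iterable


def topology_sizes(domain_codes: Iterable[str]) -> Counter:
    """Count domains per topology, i.e. per C.A.T prefix of each code."""
    return Counter(".".join(code.split(".")[:3]) for code in domain_codes)


def size_histogram(sizes: Counter) -> Counter:
    """Map each topology size to the number of topologies having that size."""
    return Counter(sizes.values())


if __name__ == "__main__":
    # Toy input for illustration, not real database content.
    toy_codes = ["3.40.50.300", "3.40.50.720", "3.40.50.300", "1.50.10.10"]
    sizes = topology_sizes(toy_codes)
    print(sizes.most_common())    # largest topologies first
    print(size_histogram(sizes))  # how many topologies share each size
```

Plotting the histogram on log-log axes is the usual way to check visually whether such counts resemble a power law.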

References

    1. Berman HM, Westbrook J, Feng Z, Gilliland G, et al. The Protein Data Bank. Nucl Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
    2. Berman HM, Battistuz T, Bhat TN, Bluhm WF, et al. The Protein Data Bank. Acta Cryst. 2002;D58:899–907.
    3. Orengo CA, Michie AD, Jones DT, Swindells MB, et al. CATH: A hierarchic classification of protein domain structures. Structure. 1997;5:1093–1108. doi: 10.1016/S0969-2126(97)00260-8.
    4. Orengo CA, Martin AM, Hutchinson G, Jones S, et al. Classifying a protein in the CATH database of domain structures. Acta Cryst. 1998;D54:1155–1167.
    5. Orengo CA, Jones DT, Taylor W, Thornton JM. Protein superfamilies and domain superfolds. Nature. 1994;372:631–634. doi: 10.1038/372631a0.