CDD: NCBI's conserved domain database

Aron Marchler-Bauer¹, Myra K Derbyshire², Noreen R Gonzales², Shennan Lu², Farideh Chitsaz², Lewis Y Geer², Renata C Geer², Jane He², Marc Gwadz², David I Hurwitz², Christopher J Lanczycki², Fu Lu², Gabriele H Marchler², James S Song², Narmada Thanki², Zhouxi Wang², Roxanne A Yamashita², Dachuan Zhang², Chanjuan Zheng², Stephen H Bryant²

Affiliations

¹ National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bldg. 38 A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA. bauer@ncbi.nlm.nih.gov
² National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bldg. 38 A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA.

PMID: 25414356
PMCID: PMC4383992
DOI: 10.1093/nar/gku1221

Aron Marchler-Bauer et al. Nucleic Acids Res. 2015 Jan;43(Database issue):D222-6. doi: 10.1093/nar/gku1221. Epub 2014 Nov 20.

Abstract

NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints. Going forward, we strive to improve the coverage and consistency of domain annotation provided by CDD. We maintain a live search system as well as an archive of pre-computed domain annotation for sequences tracked in NCBI's Entrez protein database, which can be retrieved for single sequences or in bulk. We also maintain import procedures so that CDD contains domain models and domain definitions provided by several collections available in the public domain, as well as those produced by an in-house curation effort. The curation effort aims at increasing coverage and providing finer-grained classifications of common protein domains, for which a wealth of functional and structural data has become available. CDD curation generates alignment models of representative sequence fragments, which are in agreement with domain boundaries as observed in protein 3D structure, and which model the structurally conserved cores of domain families as well as annotate conserved features. CDD can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.

Figures

**Figure 1.**
CD-Search reporting a ‘rescued’ domain annotation, which scores an E-value above the default reporting threshold of 0.01. The live search for the query sequence, derived from the PDB structure 2WOZ.

**Figure 2.**
CD-Search results for SwissProt Q6XUD6, zoomed in to ‘residue level’ display so that the precise locations of domain boundaries and functional sites become apparent. Query sequence residues highlighted in bold print have been identified as part of a functional site (such as the ‘catalytic site’ mapping to R118 and D151, plus other residues not shown in this example). Structural motifs are shown as double-headed arrows.

Grants and funding

Intramural NIH HHS/United States
