Nucleic Acids Res. 2023 Jan 6;51(D1):D384-D388.
doi: 10.1093/nar/gkac1096.

The conserved domain database in 2023

Jiyao Wang et al. Nucleic Acids Res.

Abstract

NLM's conserved domain database (CDD) is a collection of protein domain and protein family models constructed as multiple sequence alignments. Its main purpose is to provide annotation for protein and translated nucleotide sequences with the location of domain footprints and associated functional sites, and to define protein domain architecture as a basis for assigning gene product names and putative/predicted function. CDD has been available publicly for over 20 years and has grown substantially during that time. Maintaining an archive of pre-computed annotation continues to be a challenge and has slowed down the cadence of CDD releases. CDD curation staff builds hierarchical classifications of large protein domain families, adds models for novel domain families via surveillance of the protein 'dark matter' that currently lacks annotation, and now spends considerable effort on providing names and attribution for conserved domain architectures. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

Figures

Figure 1.
BATCH CD-Search results formatted for a few query sequences demonstrate the availability of domain architecture information (under the heading ‘Protein classification’), as well as transferable attributes assigned to each architecture, on top of domain footprint annotation and functional sites associated with some of the domain models. The protein classification information and site annotations can be toggled off for a sparse display focusing on domain footprints only.
