[Preprint]. 2025 Jul 7:2025.07.03.663025.
doi: 10.1101/2025.07.03.663025.

DrugDomain 2.0: comprehensive database of protein domains-ligands/drugs interactions across the whole Protein Data Bank

Kirill E Medvedev et al. bioRxiv.

Abstract

Proteins carry out essential cellular functions (signaling, metabolism, transport) through specific interactions with small molecules and drugs within their three-dimensional structural domains. Protein domains are conserved folding units that, when combined, drive evolutionary progress. The Evolutionary Classification Of protein Domains (ECOD) places domains into a hierarchy explicitly built around distant evolutionary relationships, enabling the detection of remote homologs across proteomes. Yet no single resource has systematically mapped domain-ligand interactions at the structural level. To fill this gap, we introduce DrugDomain v2.0, the first comprehensive database linking evolutionary domain classifications (ECOD) to ligand binding events across the entire Protein Data Bank. We also leverage AI-driven predictions from AlphaFold to extend domain-ligand annotations to human drug targets lacking experimental structures. DrugDomain v2.0 catalogs interactions with over 37,000 PDB ligands and 7,560 DrugBank molecules, integrates 6,000+ small-molecule-associated post-translational modifications (PTMs), and provides context for 14,000+ PTM-modified human protein models featuring docked ligands. The database encompasses 43,023 unique UniProt accessions and 174,545 PDB structures. The DrugDomain data are available online at https://drugdomain.cs.ucf.edu/ and https://github.com/kirmedvedev/DrugDomain.

Keywords: Database; Drug discovery; Drugs; Protein domains; Protein-drug interaction; Small molecules.

Conflict of interest statement

Competing interests: The authors declare that there are no competing interests associated with the manuscript.

Figures

Figure 1. DrugDomain database v2.0 data types and statistics.
Figure 2. DrugDomain v2.0 statistics.
(A) Taxonomic distribution of proteins reported in the DrugDomain database, by UniProt population. The inner pie shows the distribution of superkingdoms, and the outer donut shows the distribution of phyla. (B) Distribution of ECOD domains from experimentally determined PDB structures that interact with ligands, stratified by architecture (inner pie) and homologous group (outer donut).
Figure 3. ECOD A-groups (left column) of experimental PDB structures and superclasses of organic molecules according to ClassyFire classification (right column).
Each superclass and the lines leading to it are shown in a distinct color. The thickness of the lines shows the number of PDB ligands interacting with domains from ECOD A-groups.
Figure 4. Ligand-interaction statistics by number of domains per UniProt accession in the Protein Data Bank.
The left column shows the number of ligand-interacting domains, and the right column shows the superclasses of organic molecules according to ClassyFire classification. The thickness of the lines shows the number of UniProt accessions.
Figure 5. Structure of the human mitochondrial Mrs2 channel (PDB: 8IP5).
(A) Channel view of Mrs2 with the protein colored by ECOD domains; the Mg ion is shown in green, and sticks show interacting residues. (B) Close-up channel view of Mrs2. (C) Side view of Mrs2 showing three out of five monomers. Two chains are colored by ECOD domains, and one is colored in rainbow from blue (N-terminal part) to red (C-terminal part).
Figure 6. Examples of small molecules coordinated by pore structures.
(A) Pore view of the c-ring of mammalian F-type ATP synthase (PDB: 6TT7). Phosphatidylserine is colored in magenta. Protein structure is colored by ECOD domains. (B) Side view of the c-ring. Two chains were removed. (C) Pore view of the HIV-1 Gag polyprotein (PDB: 7R7P) with the inhibitor Bevirimat (DrugBank: DB06581) colored in magenta. Protein structure is colored by ECOD domains. (D) Side view of the HIV-1 Gag polyprotein. Two chains were removed.
