Nat Commun. 2025 Dec 25;16(1):11382.
doi: 10.1038/s41467-025-67450-9.

Multi-scale classification decodes the complexity of the human E3 ligome

Arghya Dutta et al. Nat Commun.

Abstract

E3 ubiquitin ligases are vital enzymes that define the ubiquitin code in cells. Beyond promoting protein degradation to maintain cellular health, they also mediate non-degradative processes such as DNA repair, signaling, and immunity. Despite their therapeutic potential, a comprehensive framework for understanding the relationships among diverse E3 ligases is lacking. Here, we classify the "human E3 ligome" (an extensive set of catalytic human E3s) by integrating multi-layered data, including protein sequences, domain architectures, 3D structures, functions, and expression patterns. Our classification is based on a metric-learning paradigm and uses a weakly supervised hierarchical framework to capture authentic relationships across E3 families and subfamilies. It extends the categorization of E3s into RING, HECT, and RBR classes, including non-canonical mechanisms, successfully explains their functional segregation, distinguishes between multi-subunit complexes and standalone enzymes, and maps E3s to substrates and potential drug interactions. Our analysis provides a global view of E3 biology, opening strategies for drugging E3-substrate networks, including drug repurposing and designing specific E3 handles.

Conflict of interest statement

Competing interests: R.M.B., M.K., K.H., and I.D. are head scientists at the Frankfurt Competence Center for Emerging Therapeutics (FCET), Goethe Center for (high) technology (Go4Tec), Goethe University, Frankfurt am Main, Germany. The remaining authors declare no competing interests. This manuscript reflects the views of the authors, and neither IMI nor the European Union, EFPIA, nor any associated partners are liable for any use that may be made of the information contained herein. I.D. is a founder/shareholder of Vivlion GmbH and a member of its scientific advisory board. I.D. is also a member of the scientific advisory board of the Boehringer Ingelheim Foundation, the expert committee (for international research leader grants) of the Novo Nordisk Foundation, and the advisory board of Cell and Molecular Cell. I.D. was a founder and consultant of Caraway Therapeutics Inc. M.K. is a co-founder, shareholder, and chief officer of Vivlion GmbH.

Figures

Fig. 1. Diversity of the human E3 ligome.
a A visualization showing the intersections of eight E3 ligase datasets ($A_1, \dots, A_8$) obtained from existing literature and public repositories. The matrix layout for all intersections of individual datasets is sorted by size. Filled circles and their corresponding bars indicate sets that are part of the intersection and their sizes, respectively. Individual proteins ($X_i$) from all eight datasets ($|\bigcup_{n=1}^{8} A_n| = 1448$) annotated with one or more domains, $d_i$, belonging to a set of well-studied catalytic components of E3 enzymes ($C = \{d_c\}$) were compiled to form the high-confidence E3 ligome, $\{X_i \in \bigcup_{n=1}^{8} A_n \mid d_i \in C\}$. b Pie chart showing the extent of protein annotations and filtering to identify the catalytic and non-catalytic components of the human E3 ligome. c Distribution of consensus scores for all annotated protein classes reflects cross-dataset reproducibility on E3 ligase catalytic components. The distribution of (d) protein lengths and annotation coverage for (e) all domains and (f) catalytic domains highlights the heterogeneity of the E3 ligome. g Distribution of structural coverage of the E3 ligome at the class level. Bar plots (left axis) display the number of available PDB structures for n = 208 RING, n = 21 HECT, and n = 8 RBR proteins. Violin plots (right axis; min, max, median, and mean values with mirrored density estimates on either side) represent distributions of fractional coverage for n = 2001 RING, n = 168 HECT, and n = 90 RBR structures. h The total number of unique GO terms associated with E3 classes indicates their functional vista under the biological process (BP), cellular component (CC), and molecular function (MF) ontologies. n-values on bars indicate unique proteins with GO terms.
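The filtering described in panel a can be pictured as a set union followed by a domain filter. The sketch below is illustrative only: the function names, the toy datasets, and the definition of the consensus score (fraction of datasets listing a protein) are assumptions, not the authors' pipeline.

```python
from collections import Counter


def build_ligome(datasets, domains, catalytic):
    """Union all datasets, then keep proteins with >= 1 catalytic domain."""
    union = set().union(*datasets)
    return {p for p in union if domains.get(p, set()) & catalytic}


def consensus_scores(datasets):
    """Fraction of datasets that list each protein (assumed definition)."""
    counts = Counter(p for ds in datasets for p in set(ds))
    return {p: c / len(datasets) for p, c in counts.items()}


# Toy input: three small datasets instead of the eight used in the paper.
datasets = [{"MDM2", "HUWE1", "CUL1"}, {"MDM2", "PRKN"}, {"MDM2", "HUWE1"}]
domains = {
    "MDM2": {"RING"},
    "HUWE1": {"HECT", "UBA"},
    "PRKN": {"RBR"},
    "CUL1": {"Cullin"},  # scaffold without a catalytic domain, filtered out
}
catalytic = {"RING", "HECT", "RBR"}

ligome = build_ligome(datasets, domains, catalytic)
scores = consensus_scores(datasets)
```

With this toy input, the scaffold protein is dropped from the ligome while the three proteins carrying a catalytic domain are retained.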
Fig. 2. Metric learning for E3 ligases.
a Schematic of the metric learning process. b Distribution of various pairwise distance measures spanning the molecular and systems-level organization. c Pearson correlations of distance measures indicate orthogonality, mostly r ∈ (−0.3, 0.3). Distances based on sequence alignment, domain composition, 3D structure (catalytic), and molecular function (marked in blue) are combined into an emergent distance ($D_{PQ}$) with appropriate weights. d By maximizing element-centric similarity, a measure of the overlap of emergent hierarchical clusters (right) with the ground truth (left) (e) evaluates individual metrics and their linear combinations. f Regression weights (mean ± S.D.) corresponding to the four relevant distances as a function of fractional tree cutoff h. 100 clusters with the largest $S_{EC}$ were sampled at each value of h to estimate the mean and S.D.
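As a rough illustration of panel c, the following sketch computes a Pearson correlation between two distance vectors and forms a weighted linear combination of distance matrices. The function names, the toy matrices, and the equal weights are hypothetical. The caption states only that the distances are combined "with appropriate weights", so the learned weights are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def combine_distances(matrices, weights):
    """Weighted sum of square distance matrices (an assumed linear form)."""
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per distance matrix")
    n = len(matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for mat, w in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * mat[i][j]
    return combined


# Toy 2x2 matrices standing in for sequence- and domain-based distances.
seq_dist = [[0.0, 0.2], [0.2, 0.0]]
dom_dist = [[0.0, 0.6], [0.6, 0.0]]
emergent = combine_distances([seq_dist, dom_dist], [0.5, 0.5])
```

Low pairwise correlation between the input distances is what makes such a combination informative: each measure contributes signal that the others do not already carry.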
Fig. 3. Classification of the human E3 ligome.
Unrooted hierarchical tree computed using the optimized emergent distance metric $D_{PQ}$ (scaled branch lengths). The RBR (purple), HECT (orange), and RING classes (blue/green/yellow) are partitioned at h = 0.25 into 1, 2, and 10 families, respectively. Each cluster is defined by shared sequence, domain-architectural (mapped), structural, and functional elements. Boxes show family information, i.e., family name, size, and subfamilies, with representative examples. Grey-filled circles denote bifurcation nodes with ≥ 95% bootstrap support, and * denotes families with a few class-level outliers (3/13).
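Partitioning a hierarchical tree at a fractional cutoff such as h = 0.25 can be sketched as follows. This is a minimal, assumption-laden example: it uses average linkage and interprets h as a fraction of the root merge height, neither of which is stated in the caption, and the distance matrix is toy data.

```python
def average_linkage(dist):
    """Naive agglomerative clustering; returns merges as (height, id_a, id_b)."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    merges, next_id = [], n
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for x, a in enumerate(ids):
            for b in ids[x + 1:]:
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, a, b))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def cut_tree(dist, h):
    """Clusters obtained by cutting at h times the root height (assumed meaning of h)."""
    n = len(dist)
    merges = average_linkage(dist)
    if not merges:
        return [[i] for i in range(n)]
    threshold = h * merges[-1][0]  # average-linkage heights never decrease
    clusters = {i: [i] for i in range(n)}
    next_id = n
    for d, a, b in merges:
        if d > threshold:
            break  # every later merge is at least this high
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())


# Toy distances: two tight pairs that are far from each other.
toy = [
    [0.0, 0.1, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.2],
    [1.0, 1.0, 0.2, 0.0],
]
families = cut_tree(toy, 0.25)
```

Lowering h splits the tree into more, smaller groups, which is how the same tree can yield both families and subfamilies.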
Fig. 4. Functional segregation of the E3 ligome.
Volcano plots of gene essentiality analysis derived from CRISPR screens for (a) catalytic and (b) non-catalytic components of the E3 ligome. c GO enrichment analysis for essential catalytic E3s. d The functional landscape of the E3 ligome (biological processes) is captured by the network of GO annotation clusters. Individual nodes representing GO clusters (20 labeled) are drawn as pie charts (size proportional to the number of E3s; colored by family enrichment) connected by distinct edges (κ-similarity ≥ 0.3). e The heatmap displays all functional clusters corresponding to family-specific enrichment of E3s (p value estimated using a two-sided hypergeometric test; discrete color scale for p value ≤ 0.01; white otherwise). Colored triangles show examples of family-specific enrichment for (f) K6-linked ubiquitination (purple) and antiviral innate immune response (green), (g) starvation response under 6 h EBSS treatment (blue), and (h) DNA damage response under 4 h 100 nM CPT treatment (orange). For panels f–h, gene essentiality data, log2(FC) or DepMap Gene Effect scores (*), are plotted for individual E3s. The ratio denotes the fraction of E3s with direct experimental evidence (PMIDs) for GO functions. Panels g and h also show volcano plots of proteomic analysis, highlighting significantly up-regulated and down-regulated proteins (red scatter; adjusted p values were obtained using the Benjamini-Hochberg method in two-sided moderated t-tests) with overlapping E3s (colored) and control proteins (blue filled circles).
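The two statistical procedures named in this caption, the two-sided hypergeometric test and the Benjamini-Hochberg adjustment, can be sketched in plain Python. The two-sided convention used here (summing all outcomes no more probable than the observed one) is an assumption, since the caption does not say which convention was applied. The counts and p values are toy numbers.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(k annotated E3s in a family of size n), given K annotated among N."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def hypergeom_two_sided(k, N, K, n):
    """Sum the probabilities of outcomes no more likely than the observed one."""
    lo, hi = max(0, n - (N - K)), min(K, n)
    p_obs = hypergeom_pmf(k, N, K, n)
    probs = (hypergeom_pmf(x, N, K, n) for x in range(lo, hi + 1))
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy case: 20 E3s, 5 carry a GO term, a family of 5 contains 4 of them.
p_enrich = hypergeom_two_sided(4, 20, 5, 5)
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In this toy case the enrichment p value falls below the 0.01 threshold used for the heatmap color scale.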
Fig. 5. Protein–protein interactions of the E3 ligome.
Representative examples of E3 ligases functioning as a (a) multi-subunit protein complex (CRL) or (b) a standalone enzyme (HECD3). c Venn diagram of pairwise interactions of adaptors, receptors, and scaffold proteins with E3s. d Annotation of 462 E3 ligases into complex, standalone, or unclassified modes of action. e Family-wise mapping of data from (d). f Pairwise E3–substrate interactions for all E3s obtained by integrating data from known ESIs, mapped transient direct and indirect PPIs, and predicted ESIs. g Mapping of the ubiquitinated proteome with E3s (≈ 62%, n = 12464). h Schematic showing substrate categorization into E3-specific, family-specific, and promiscuous classes (left) and their relative distributions mapped onto E3 families (right). i Representative examples for the three types of ESI networks.
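The substrate categorization in panel h can be made concrete with a small rule-based sketch. The rules below are an assumed reading of the three class names (one E3, several E3s from one family, E3s from several families), and all protein and family labels are hypothetical placeholders.

```python
def classify_substrate(e3s, family_of):
    """Label a substrate by how widely its E3s are spread (assumed rules)."""
    if not e3s:
        raise ValueError("substrate has no mapped E3")
    if len(e3s) == 1:
        return "E3-specific"
    families = {family_of[e] for e in e3s}
    if len(families) == 1:
        return "family-specific"
    return "promiscuous"


# Hypothetical E3-to-family map and substrate-to-E3 interactions.
family_of = {"E3_A": "FAM1", "E3_B": "FAM1", "E3_C": "FAM2"}
interactions = {
    "SUB_1": {"E3_A"},
    "SUB_2": {"E3_A", "E3_B"},
    "SUB_3": {"E3_A", "E3_C"},
}
labels = {s: classify_substrate(e3s, family_of) for s, e3s in interactions.items()}
```

Counting these labels per family would give distributions of the kind shown on the right of panel h.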
Fig. 6. Druggability map of the E3 ligome.
a Distribution of known E3 handles (extracted from PROTACs, top) and an expanded set of E3 binders (potential lead compounds, bottom) targeting E3 families. Individual proteins uniquely targeted by E3 handles (n = 16, black) and E3 binders (n = 40, red) are displayed for each family. Grey-filled boxes (top) show closely related protein targets for E3 handle/PROTAC repurposing. b Reduced 2D UMAP chemical space of E3 handles (n = 96) and E3 binders (n = 13524); size proportional to p-ChEMBL value. Compound clusters (colored) within UMAP space represent distinct chemical structures; cluster centers (indexed #1–20) are identified by local density peaks (see Supplementary Figs. 21). c Magnified view of cluster #2 showing dense sub-clusters of compounds targeting multiple proteins. d Log-transformed propensities, $LP_{ij}$, of individual compound clusters capture binding likelihood. e Sankey plot showing the map between PDB (3D interaction), ELIOT (pocketome), individual E3 proteins (19/56), and their compound clusters (covering 212/13620) from our small-molecule interaction analysis. f Magnified view of cluster #13 showing the proximity of E3 handles targeting MDM2/4 to SMUF1/BCL6 binders, inferred from compound similarity and clustering in the reduced UMAP representation. g Example of a potential lead compound identified from cluster #4. Ligand 4QH (similar to JQ1) binds to the TIF1A bromodomain (PDB code: 4ZQL) and can be developed into a specific E3 handle. Binding site analysis (from ELIOT) indicates a favorable PROTAC score and high similarity to the TRI33 bromodomain pocket (RING8 member).
