Cell. 2018 Jun 14;173(7):1622-1635.e14.
doi: 10.1016/j.cell.2018.04.028. Epub 2018 May 17.

The Eukaryotic Proteome Is Shaped by E3 Ubiquitin Ligases Targeting C-Terminal Degrons

Itay Koren et al. Cell.

Abstract

Degrons are minimal elements that mediate the interaction of proteins with degradation machineries to promote proteolysis. Despite their central role in proteostasis, the number of known degrons remains small, and a facile technology to characterize them is lacking. Using a strategy combining global protein stability (GPS) profiling with a synthetic human peptidome, we identify thousands of peptides containing degron activity. Employing CRISPR screening, we establish that the stability of many proteins is regulated through degrons located at their C terminus. We characterize eight Cullin-RING E3 ubiquitin ligase (CRL) complex adaptors that regulate C-terminal degrons, including six CRL2 and two CRL4 complexes, and computationally implicate multiple non-CRLs in end recognition. Proteome analysis revealed that the C termini of eukaryotic proteins are depleted for C-terminal degrons, suggesting an E3-ligase-dependent modulation of proteome composition. Thus, we propose that a series of "C-end rules" operate to govern protein stability and shape the eukaryotic proteome.

Keywords: C terminus; CRL; Cullin; DesCEND; E3 ubiquitin ligase; GPS; degron; global protein stability; protein degradation; ubiquitination.

Figures

Figure 1. CRISPR screening combined with GPS profiling of a synthetic human peptidome identifies multiple CRL2 complexes targeting unstable peptides
(A) Overview of GPS-peptidome library construction and screening pipeline. (B) Stabilization of unstable GFP-peptide fusions in the Bin1 population upon treatment for 5 h with 5 mM of the proteasome inhibitor MG132, or 100 nM of the lysosomal inhibitor Bafilomycin A1 (BafA1), as assessed by FACS. (C to E) Isolation of CRL substrates. (C) The Bin1 population was treated for 5 h with 1 μM of MLN4924. Cells expressing stabilized GFP-peptide fusions were then purified by FACS and, after recovery, reanalyzed following MLN4924 treatment (D). Expression of DN Cullins for 24 h followed by FACS analysis (E, see also Figure S1D). (F) The heatmap represents the CRISPR screen MAGeCK scores for the indicated genes across each of the 17 individual clones. The full data for each clone are shown in Figure S2. (G) CRISPR-mediated ablation of the indicated genes in selected clones resulted in the stabilization of the GFP-peptide fusion proteins. An sgRNA targeting AAVS1 was used as a negative control. (H to J) KLHDC3 recognizes glycine-ended substrates. Six of the KLHDC3 substrates identified in the CRISPR screen terminated in glycine (H). For one example substrate (clone 6), mutation of the terminal glycine stabilized the GFP-peptide fusion (I), as did repositioning the glycine by appending 10 additional amino acids (-DNYNEPKANQ*) at the C-terminus (J). See also Figures S1 and S2.
Figure 2. A GPS-ORFeome screen identifies full-length glycine-ended proteins as CRL2 substrates
(A) Schematic representation of the GPS lentiviral vector. BC, barcode. (B) Schematic representation of the GPS-ORFeome screen. (C) Example profiles for one unstable protein (CDK4) and one stable protein (HIST2H2AB) are shown. Each color series represents the distribution of sequencing reads for an individual barcode attached to the same GFP-ORF fusion. (D) Heatmap showing the relative proportions of each amino acid across the last five C-terminal residues of the ORFs stabilized by MLN4924 compared to the whole GPS-ORFeome library. (E) List of high-confidence CRL substrates identified from the GPS-ORFeome screen that terminated with glycine. (F) Twelve candidate genes were selected at random from the list in (E) and expressed with an N-terminal HA epitope tag in either wild-type (WT) or Cul2 knockout (KO) HEK-293T cells, and protein abundance was assessed by immunoblot (IB). (G) For two example GPS-ORF substrates, treatment with 1 μM MLN4924 for 5 h stabilized the wild-type proteins (top row), while mutation of the C-terminal glycine or the addition of one extra residue to the C-terminus resulted in stabilization (bottom row). See also Figure S3.
Figure 3. C-terminal glycine correlates with protein instability and is depleted from eukaryotic proteomes
(A) C-terminal glycine correlates with instability: ORFs terminating in glycine are enriched in Bin1 and depleted from Bin5. Glycine at the terminal (-1) position is depicted in red, while glycine at all other positions in the last ten residues is shown in gray. (B) Normalized amino acid frequencies across the last ten residues of eukaryotic proteins. (C) Normalized frequency of glycine across the last ten residues of proteomes from the indicated taxa. (D) Amino acid proportions across the last ten positions of each proteome are shown. The data for each residue are normalized to the mean proportion across the last ten positions.
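The normalization described in panel (D), where each residue's proportion at a position is divided by its mean proportion across the last ten positions, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `cterm_normalized_frequency`, the choice to skip sequences shorter than the tail, and the toy sequences are all assumptions made for the example.

```python
def cterm_normalized_frequency(sequences, residue, tail_len=10):
    """Normalized frequency of `residue` at each of the last `tail_len` positions.

    For each position, the proportion of sequences carrying `residue` there is
    divided by the mean of those proportions across all `tail_len` positions.
    Positions are keyed -tail_len .. -1, where -1 is the C-terminal residue.
    Sequences shorter than `tail_len` are skipped (an assumption of this sketch).
    """
    tails = [s[-tail_len:] for s in sequences if len(s) >= tail_len]
    if not tails:
        raise ValueError("no sequence is at least tail_len residues long")

    # Proportion of tails with `residue` at each tail index (0 = position -tail_len).
    proportions = [
        sum(1 for t in tails if t[i] == residue) / len(tails)
        for i in range(tail_len)
    ]
    mean_proportion = sum(proportions) / tail_len
    if mean_proportion == 0:
        raise ValueError("residue is absent from all tail positions")

    return {i - tail_len: p / mean_proportion for i, p in enumerate(proportions)}


# Toy (made-up) sequences, purely to show the call shape.
toy = ["AAAAAAAAGA", "AAAAAAAAAG", "GAAAAAAAAA", "AAAAGAAAAA"]
glycine_profile = cterm_normalized_frequency(toy, "G")
```

A value below 1 at position -1 would indicate depletion of the residue at the extreme C-terminus relative to the rest of the tail, which is the pattern the figure reports for glycine in eukaryotic proteomes.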
Figure 4. The CRL2 adaptors KLHDC2, KLHDC3 and KLHDC10 target distinct C-terminal glycine degrons
(A) Schematic representation of the G-end GPS library screen. (B) Comparison of the amino acid frequencies observed at the -2 position preceding the C-terminal glycine residue among KLHDC2, KLHDC3 or KLHDC10 substrates. (C) Consensus sequences for the C-terminal degrons recognized by KLHDC2, KLHDC3 or KLHDC10. (D) Saturation mutagenesis was performed for two Cul2^KLHDC2 substrates (EPHB2 and PDGFC) and for two Cul2^KLHDC3 substrates (EMID1 and CHGA). In each case, darker colors represent a greater degree of stabilization conferred by the mutation. (E) Summary of the C-terminal degrons recognized by KLHDC2, KLHDC3 and KLHDC10. (F) Comparison of the normalized frequency at the -2 position of the indicated “favored” (G, R, K, Q, W, P and A) or “disfavored” (D, E, V, I and L) amino acids for recognition by KLHDC2, KLHDC3 and KLHDC10. (G) Depletion of C-terminal glycine is specific to the proteomes of eukaryotes and eukaryotic viruses. See also Figure S4.
Figure 5. Global identification of C-terminal degrons through stability profiling of C-terminal peptides
(A) Schematic representation of the C-termini GPS-peptidome screen. (B) Heatmaps showing the relative depletion (blue) or enrichment (red) of each amino acid across all positions of the 23-mer peptide in the unstable Bin1 population (left) versus the stable Bin4 population (right). (C) Greater numbers of acidic residues correlate with increased stability, while greater numbers of bulky aromatic residues correlate with instability. (D) For all possible combinations of di-peptide motifs, the mean difference in stability between peptides containing the motif at the extreme C-terminus was compared to peptides containing the motif at an internal position in the 23-mer peptide (see Methods). (E) Identification of common classes of potential degron motifs among the top 100 motifs predicted to be most destabilizing specifically when located at the C-terminus. (F) Boxplots showing the distribution of Protein Stability Indices (PSI) for all peptides harboring the indicated classes of motif internally within the 23-mer peptide (gray boxes) or at the C-terminus (colored boxes). (G) Heatmap showing the relative enrichment (red) or depletion (blue) of amino acids among the C-terminal tails of APPBP2 substrates relative to the whole pool of unstable peptides in the GPS-peptidome library. (H) Logoplots showing the consensus C-terminal amino acid sequences among APPBP2 substrates containing glycine at the -2 (top) or -3 (bottom) position. (I) Summary of the C-terminal degron recognized by APPBP2. See also Figures S5, S6 and S7.
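The comparison in panel (D), between peptides carrying a motif at the extreme C-terminus and peptides carrying it internally, can be illustrated with a short sketch. The exact procedure is given in the paper's Methods and is not reproduced here; the function name `motif_cterm_effect`, the rule that a peptide ending in the motif counts only toward the C-terminal group, and the toy stability values are assumptions for illustration.

```python
from statistics import mean


def motif_cterm_effect(peptides, motif):
    """Mean stability of peptides ending in `motif` minus mean stability of
    peptides containing `motif` only at internal positions.

    `peptides` is an iterable of (sequence, stability_index) pairs.
    Returns None when either group is empty. A negative value means the motif
    is more destabilizing at the C-terminus than internally.
    """
    cterm_scores, internal_scores = [], []
    for sequence, stability in peptides:
        if sequence.endswith(motif):
            # Assumption: C-terminal occurrences take precedence over internal ones.
            cterm_scores.append(stability)
        elif motif in sequence:
            internal_scores.append(stability)
    if not cterm_scores or not internal_scores:
        return None
    return mean(cterm_scores) - mean(internal_scores)


# Toy (made-up) peptides and stability indices, purely to show the call shape.
toy_peptides = [
    ("AAAGG", 1.0),
    ("CCCGG", 2.0),
    ("AAGGA", 3.0),
    ("GGAAA", 4.0),
    ("AAAAA", 2.0),
]
gg_effect = motif_cterm_effect(toy_peptides, "GG")
```

Ranking all di-peptide motifs by this difference would give one way to shortlist motifs that are destabilizing specifically at the C-terminus, in the spirit of panel (E).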
Figure 6. Identification of additional classes of C-terminal degrons recognized by CRL2 and CRL4 complexes
(A to C) Heatmaps displaying the relative depletion (blue) or enrichment (red) of each amino acid across the last five C-terminal residues of peptide substrates stabilized following treatment with (A) MLN4924, (B) DN Cul2 or (C) DN Cul4, compared in each case to the whole C-terminal GPS-peptidome library. (D and E) Cells expressing GPS constructs in which GFP is fused to the last 23 residues of (D) CDK5R1, (E) MAGEA3 or NPPB were analyzed either as wild-type, with mutation of the key C-terminal residues or with addition of amino acids (-KASTN*) at the C-terminus, with or without the indicated inhibitor or expression of DN Cullin as indicated. (F) Heatmap representing the degree of enrichment of the indicated genes in the CRISPR screen as determined by MAGeCK comparing the sorted cells to the unselected populations. (G) Summary of the C-terminal degrons recognized by FEM1A-C, DCAF12 and TRPC4AP. (H) HEK-293T or A375 cells were treated with 1 μM MLN4924 for 8 h and protein abundance was assessed by immunoblot (IB). (I) HEK-293T cells (TSPYL1, p14ARF or PTOV1 immunoblots) or A375 cells (CCT5 or MAGEA3 immunoblot) were transduced with Cas9 and sgRNAs targeting the indicated genes, and protein abundance was assessed by immunoblot 7 days later. (J) HEK-293T cells expressing GFP fused to the C-terminal 23 residues of N-Myc were transduced with Cas9 and two independent sgRNAs targeting TRPC4AP and analyzed 7 days later by FACS. (K) Mutation of the critical arginine or deletion of the last three residues stabilized the GFP-fusion protein as measured by FACS. See also Figures S6 and S7.
Figure 7. Recognition of C-terminal degrons is a general property of E3s that has shaped the human proteome
(A) Of the top 100 predicted destabilizing C-terminal motifs (Figure 5), 58 are enriched among the pool of CRL substrates while 42 are not. (B) Boxplots showing the distribution of PSI for all peptides harboring the indicated classes of motif internally within the 23-mer peptide (gray boxes) or at the C-terminus (colored boxes). (C) Cells expressing GPS constructs in which the C-terminal 23 residues of the indicated genes comprising representative non-CRL degrons were fused to GFP. These degrons were analyzed either as wild-type, with mutation of the key C-terminal residues, or with addition of amino acids (-KASTN*) at the C-terminus as indicated, with or without the indicated inhibitors. (D) Summary of the non-CRL C-terminal degrons. (E) Normalized amino acid frequency of the indicated residue(s) in the human proteome, showing the degree of depletion at the degron position (colored bars) versus the mean normalized frequency across all other positions in the C-terminal tail (gray bars). (*P < 0.05, **P < 0.01, ***P < 0.001; Fisher's exact test). (F) Heatmap showing the significance of the depletion of each residue across the last five C-terminal residues of the human proteome.
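Panel (E) reports significance from Fisher's exact test. As a rough sketch of how a depletion p-value could be computed from a 2x2 table of counts, the function below sums hypergeometric probabilities using only the standard library. The caption does not state whether the test was one- or two-sided, so the one-sided ("less") form used here is an assumption, as are the function name `fisher_exact_less` and the layout of the table.

```python
from math import comb


def fisher_exact_less(a, b, c, d):
    """One-sided Fisher's exact test p-value for depletion.

    Assumed 2x2 table layout:
        a = proteins with the residue at the degron position
        b = proteins with any other residue at the degron position
        c = occurrences of the residue at the comparison positions
        d = occurrences of other residues at the comparison positions

    Returns P(X <= a) under the hypergeometric null with fixed margins.
    """
    n = a + b + c + d
    row1 = a + b          # total observations at the degron position
    col1 = a + c          # total observations of the residue
    smallest = max(0, row1 + col1 - n)  # lowest count compatible with the margins
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(smallest, a + 1)
    )
    return favorable / comb(n, row1)


# Toy (made-up) counts, purely to show the call shape.
p_value = fisher_exact_less(0, 2, 2, 0)
```

For real analyses, an established implementation such as `scipy.stats.fisher_exact` with `alternative="less"` would normally be preferred over a hand-rolled sum.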
