Nat Chem Biol. 2023 Feb;19(2):151-158.
doi: 10.1038/s41589-022-01142-z. Epub 2022 Oct 17.

Structural basis of colibactin activation by the ClbP peptidase

José A Velilla et al. Nat Chem Biol. 2023 Feb.

Abstract

Colibactin, a DNA cross-linking agent produced by gut bacteria, is implicated in colorectal cancer. Its biosynthesis uses a prodrug resistance mechanism: a non-toxic precursor assembled in the cytoplasm is activated after export to the periplasm. This activation is mediated by ClbP, an inner-membrane peptidase with an N-terminal periplasmic catalytic domain and a C-terminal three-helix transmembrane domain. Although the transmembrane domain is required for colibactin activation, its role in catalysis is unclear. Our structure of full-length ClbP bound to a product analog reveals an interdomain interface important for substrate binding and enzyme stability and interactions that explain the selectivity of ClbP for the N-acyl-D-asparagine prodrug motif. Based on structural and biochemical evidence, we propose that ClbP dimerizes to form an extended substrate-binding site that can accommodate a pseudodimeric precolibactin with its two terminal prodrug motifs in the two ClbP active sites, thus enabling the coordinated activation of both electrophilic warheads.

Conflict of interest statement

E.P.B. and M.R.V. are listed as inventors on a provisional patent (US application 63/135,825) which relates to the methods and ClbP inhibitors described in Reference 22. The other authors declare no competing interests.

Figures

Fig. 1. The TMD of ClbP completes the substrate-binding site.
a, The proposed structure of colibactin is pseudodimeric and contains two electrophilic warheads that generate inter-strand cross-links in the DNA of epithelial cells in the human gut. To activate this toxin, the ClbP peptidase cleaves off the two prodrug motifs (colored in magenta) from the precursor molecule precolibactin, leading to non-enzymatic condensation to form the active warheads (curved arrows). b, The structure of full-length ClbP reveals an interface between the periplasmic and transmembrane domains. The inset on the left provides an expanded view of the interdomain interface. The conserved TMD residues and the catalytic triad are shown as sticks. The inset on the right shows interactions of the β3-β4 loop (dark yellow) with the TMD that likely stabilize the orientation of the catalytic site toward the cell membrane.
Fig. 2. The prodrug motif binds at the interface between periplasmic and transmembrane domains.
a, Substrate analog included in crystallization of catalytically inactive ClbP. Our data suggest that this molecule is hydrolyzed during crystallization, as the atoms in gray are not observed in the electron density map. b, Two views, related by a 90° rotation, of the hydrolysis product bound at the active site. The d-asparagine sidechain of the prodrug motif interacts with periplasmic domain residues S188, H257 and N331, and the acyl chain interacts with TM2–TM3 linker residues F462 and W466 (sidechains shown as sticks). Polder map omitting the product contoured at 7σ is colored in cyan, and bromine anomalous difference Fourier map contoured at 3.5σ is colored in purple. c, Enzymatic activity of purified ClbP variants measured as cleavage of a fluorogenic substrate analog (Extended Data Fig. 3c). The plot represents triplicate measurements normalized to the average for wild-type (WT) ClbP.
Fig. 3. N331 enforces d-asparagine specificity.
a, A network of interactions initiated by K235 orients N331 such that the carbonyl in its sidechain faces toward the binding pocket (the cartoon representation of residues 255–261 is transparent to optimize the view). b, Activity assay with substrate analogs containing prodrug motifs with alternative d-amino acids. Cleaved prodrug motif is detected by LC–MS (normalized to AUC of S95A) after a 5-hour incubation of the substrate with purified ClbP variants. c, Results of the assay in b for the substrate analogs containing d-Asn (left) or d-Asp (right); n = 3 independent experiments. None of the ClbP variants cleaved substantial amounts of the d-Gln-containing substrate analog (Extended Data Fig. 4d). Perturbing the orientation of the N331 sidechain allows ClbP to cleave d-aspartate substrates, suggesting that this residue is crucial for substrate specificity. Representative traces from the LC–MS are shown in Extended Data Fig. 4c. WT, wild-type.
Fig. 4. ClbP forms a dimer that accommodates pseudodimeric precolibactin.
a, ClbP dimer observed in the crystal-packing interactions from a plane perpendicular to the cell membrane, denoted as black lines. b, Orthogonal view of the dimer interface looking from the periplasm to the inside of the cell. The interface forms around a two-fold crystallographic symmetry axis (black oval) and consists of a pair of interlocking loops that contribute both hydrophobic and polar interactions. The largest predicted energetic contributors to stabilizing this interface are interactions formed among residues R308, Y324 and D367 (shown as thick sticks). All other residues participating in the interface are shown as thin sticks. c,d, Detailed view of interactions mediated by R308 (c) and K374 (d). e, 3D reconstruction obtained from cryo-EM analysis of wild-type ClbP. Density colored to correspond to each subunit, and the detergent micelle is shown as a transparent surface with dust hidden for clarity. f,g, Model of precolibactin binding to the ClbP dimer obtained by individually docking fragments of the molecule (Supplementary Fig. 5). Precolibactin can straddle both subunits of the ClbP dimer such that the prodrug motifs at both ends can each bind a different active site simultaneously. Views of precolibactin binding to the dimer as seen from a plane perpendicular to the membrane (f) as well as to the surface of the cavity subtended by the dimer (g). Note that the docked molecule contains hexanoyl chains in place of the natural tetradecanoyl (or ‘C14’) chains of the myristoyl groups.
Extended Data Fig. 1. Identification and sequence conservation of prodrug-activating homologs of ClbP.
a, Sequence similarity network (SSN) for 730 ClbP homologs, colored by identified BGC (if any). Peptidases involved in amicoumacin, edeine, paenilamicin, and zwittermicin biosynthesis cluster together, along with the newly identified probable Gram-positive colibactin producers. The Gammaproteobacterial ClbPs are split into two distinct subsets, one comprising close relatives of the sequences found in Pseudovibrio strains and the other representing ClbPs from BGCs with canonical architecture in Escherichia species (with homologs from Erwinia, Frischella, Gilliamella, and Serratia strains, among others). Similarly, AmiB homologs in Xenorhabdus strains cluster with XcnG sequences rather than Gram-positive AmiBs. Intriguingly, the genomes of some strains – such as the edeine-producing B. formosus NF2 – appear to have multiple biosynthetic gene clusters containing authentic prodrug peptidases with potentially distinct activities. b, SSN colored to highlight proximity (within a ±10 gene neighborhood) of the peptidase gene to a gene containing both NRPS A and C domains, as a proxy for the presence of a NRPS module playing a ClbN-like role. The only SSN clusters in which this condition was met were clusters containing homologs of known prodrug peptidases (circled in black). All our sequence conservation analyses were performed using these clusters. c, SSN colored by phylum, highlighting that prodrug-activating peptidases are most common among Firmicutes, with some spread into Proteobacteria (as seen with the ami, clb, and xcn BGCs) and into Actinobacteria. d, The prodrug-activating peptidase SSN is colored by amino acid sequence length to emphasize that a large subset of sequences (including EdeA, PamJ, and ZmaM homologs) are much longer. This can be attributed to fusion with a second domain with homology to components of an ABC transporter, commonly annotated as a cyclic peptide transporter. 
However, Gram-positive AmiB and ClbP sequences (and other unidentified but related proteins) lack this additional domain and more closely resemble E. coli ClbP, X. bovienii AmiB, and XcnG. e, Sequence logos built from the alignment of 271 candidate prodrug-activating peptidases detail sequence conservation of the catalytic triad and of periplasmic-TMD interface and intra-TMD positions discussed in the main text.
Extended Data Fig. 2. Comparison of the structures of ClbP in the presence of monoolein and a product analog illustrates that monoolein mimics a ClbP substrate or product.
a, Monoolein in the crystallization mesophase was trapped in the active site of one of our structures. The panels show two views, related by a 180° rotation, of a monoolein molecule (cyan) bound in the active site, with the corresponding electron density for a polder map contoured at 7σ. b, A side-by-side comparison of the active-site interactions of the (4-(4-bromophenyl)butanoyl)-d-asparagine product and monoolein illustrates that monoolein interacts similarly with active site residues that bind to the prodrug motif, explaining how it can outcompete substrate analogs introduced only in the precipitant solution but not the lipidic mesophase during the crystallization process. Hydrogen bonds are indicated as black dotted lines. c, In addition to the hydrolysis product in the active site (cyan sticks), we observed electron density corresponding to an intact substrate molecule at an adjacent site (cyan spheres). d, The inset shows sidechains within 4.2 Å of the bound substrate analog, with hydrogen bonds indicated as black dotted lines. The corresponding electron density for a polder map is contoured at 7σ.
Extended Data Fig. 3. Activity of ClbP variants with mutations to conserved substrate-binding residues measured using a fluorogenic assay.
a, Sequence logos representing conservation of N-acyl-d-asparagine binding residues among 271 aligned sequences of prodrug-activating homologs (top), compared to logos built from an alignment of 901 representative sequences from the broader S12 family downloaded from the MEROPS database. b, Fluorogenic activity assay used to measure the peptidase activity of ClbP variants. Cleavage of the synthetic substrate probe by ClbP generates an intermediate which then undergoes a non-enzymatic cyclization reaction to yield the active fluorophore (gray box). c, Curves of the raw fluorescence versus time for different ClbP variants with point mutations at residues of interest that interact with the substrate (top row) or form notable interdomain (K240A and F243A) or intra-TMD (W460) interactions (see Extended Data Fig. 1e for the corresponding sequence logos). Each panel represents triplicate measurements for the indicated variant (cyan). For comparison, the corresponding triplicate measurements for wild-type ClbP (black) and catalytically inactive S95A (gray), measured in the same experiment, are reproduced on each graph. The two gray vertical lines bound the data used for calculating the normalized hydrolysis rates in Fig. 2.
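The captions state only that the data between the two vertical lines were used to calculate hydrolysis rates and that rates were normalized to the wild-type average; they do not give the fitting procedure. As a minimal sketch, assuming the rate is taken as the least-squares slope of fluorescence versus time inside that window, the calculation could look like the following. The function names, the window bounds and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def window_slope(times, signal, t_start, t_end):
    """Least-squares slope of signal vs. time for points with t_start <= t <= t_end.

    Assumption: a straight-line fit inside the window; the source does not
    state how the rate was actually extracted.
    """
    pts = [(t, y) for t, y in zip(times, signal) if t_start <= t <= t_end]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the window")
    t_bar = mean(t for t, _ in pts)
    y_bar = mean(y for _, y in pts)
    sxx = sum((t - t_bar) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points inside the window are all identical")
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in pts)
    return sxy / sxx


def normalized_rates(variant_curves, wt_curves, times, t_start, t_end):
    """Per-replicate slopes of a variant divided by the mean wild-type slope."""
    wt_mean = mean(window_slope(times, c, t_start, t_end) for c in wt_curves)
    return [window_slope(times, c, t_start, t_end) / wt_mean for c in variant_curves]


# Synthetic, invented triplicates (arbitrary fluorescence units vs. minutes).
times = [0, 1, 2, 3, 4, 5]
wt_curves = [[100 + s * t for t in times] for s in (10, 12, 8)]
variant_curves = [[100 + 5 * t for t in times] for _ in range(3)]
rates = normalized_rates(variant_curves, wt_curves, times, t_start=1, t_end=4)
```

With these invented curves the wild-type slopes average 10 units per minute, so each variant replicate comes out at half the wild-type rate. A real analysis might also subtract the S95A background or fit a different model; that choice is not specified in the captions.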
Extended Data Fig. 4. Wild-type ClbP and variants with mutations at d-asparagine binding residues cannot process substrate with an N-acyl-d-glutamine prodrug motif.
a, Sequence logo representing conservation of E92, which stabilizes the orientation of d-asparagine specificity residue N331. b, Normalized hydrolysis rates calculated from activity assays performed with d-asparagine binding mutants E92Q and S188N (left). Triplicate fluorescence activity measurements for each mutant (cyan) are shown with triplicates for wild-type ClbP (black) and catalytically inactive S95A (gray) collected in the same experiment (right). c, Assay measuring activity of mutants for dipeptide substrates containing d-asparagine, d-aspartate, or d-glutamine prodrug motifs by LC–MS detection of cleaved product. Extracted ion chromatogram (EIC) traces of the [M + H]+ ion for each of the substrates tested (left) and the expected ClbP cleavage product (right). d, None of the d-asparagine binding mutants process substantial amounts of the d-glutamine substrate, as indicated by the lack of a difference in activity between the catalytically deficient S95A and any of the other variants (n = 3 independent experiments).
Extended Data Fig. 5. Crystal packing interactions suggest ClbP forms a dimer.
a, The ClbP dimer (left) and the dimer in context of the crystal packing arrangement (right), with the asymmetric units containing each subunit in yellow or green and other symmetry mates in gray. b, The same dimeric arrangement observed in our full-length ClbP structure is also present as a crystallographic dimer interface in the published structure of the isolated periplasmic domain (top), as seen in the interaction between two trimeric asymmetric units (bottom). c, 2D representation of the ClbP dimer interface detailing the molecular interactions stabilizing the assembly as well as distances (in Å) between interacting polar groups.
Extended Data Fig. 6. Cryo-EM data analysis of ClbP.
a, Size exclusion chromatogram of wild-type ClbP purified in GDN. Protein eluted in the fraction bounded by the vertical lines was used for cryo-EM analysis. b, Representative micrograph of ClbP embedded in vitreous ice (scale bar = 600 Å; from 3888 micrographs), low-pass and high-pass filtered for clarity. c, Selected 2D class averages of ClbP. d, Reconstruction of ClbP filtered and colored by local resolution. The complete data analysis procedure is in Extended Data Fig. 7a. e, Gold-standard Fourier shell correlation (FSC) curves from cryoSPARC. f, Viewing direction distribution plot.
Extended Data Fig. 7. Cryo-EM data processing procedure and model of dimeric ClbP.
a, Processing scheme for classification and refinement of ClbP. The final reconstruction is shown as a locally filtered map with dust hidden for clarity. b, Superposition of the cryo-EM (tan) and crystal (gray) structures of dimeric ClbP. The two structures are nearly identical (RMSD of 0.830 Å over 904 residues), except for a ~5° bend of the TMDs towards the center of the dimer cavity in the cryo-EM structure. c, Superposition of the cryo-EM and crystal structures focusing on a single subunit. View related to b by a 90° rotation. d, The 3D reconstruction of dimeric ClbP revealed a branched density in the active site that likely corresponds to a copurified lipid. Density is contoured at 5.5σ and the catalytic triad residues are shown as sticks for reference.
Extended Data Fig. 8. The dimer interface is important for the stability of ClbP.
a, Superposed normalized size exclusion chromatograms of wild-type ClbP and variants with mutations at the dimer interface performed on a Superdex 200 10/300 column. While no mutation yields a detectable monomeric species, R308E induces the formation of higher molecular weight aggregates. The vertical line indicates the elution volume of dimeric ClbP. b, Enzyme activity measurements of dimer interface variants normalized to the wild-type average, using the in vitro fluorogenic activity assay (number of experimental replicates: n = 13 (WT and S95A), 6 (R308E) or 3 (all others)). c, Raw fluorescence versus time curves for activity assays performed with dimer interface mutants. Each panel represents replicate measurements (n = 3 for R308A, Y324A, D367A, and K374E and n = 6 for R308E) for the indicated variant (cyan). For comparison, the corresponding replicate measurements for wild-type ClbP (black) and catalytically inactive S95A (gray), measured in the same experiment, are reproduced on each graph. The two gray vertical lines bound the data used for calculating the normalized hydrolysis rates in b. d, Superposed normalized size exclusion chromatograms of two constructs that replace residues 299-310 or 304-308, respectively, of the longest interface loop with a two-glycine linker. Both constructs elute primarily as higher molecular weight aggregates, suggesting the dimer interface is crucial for the integrity of biochemically isolated ClbP.
Extended Data Fig. 9. ClbP dimerizes through loops that are not highly conserved.
a, Sequence logo representing conservation of dimer interface regions highlighted in panel b on the structure and in Supplementary Figure 3 on the sequence among 15 ClbP homologs from colibactin biosynthetic clusters. Residues predicted to be important for dimerization are not strongly conserved, suggesting that the mode of dimerization we observe in E. coli ClbP may be an adaptation of Proteobacterial ClbP. b, ClbP dimer interface highlighting the α8-β11 loop region (residues 296-324; blue) and α11 helix (357-372; red). c, Unrooted sequence similarity tree of S12 homologs with structures deposited in the PDB. The inset details the homologs in the same clade as ClbP. d, Equivalent views of the α8-β11 loop region (blue) in the structures of homologs in the same clade as ClbP. The α8-β11 loop regions are highly variable in structure among the S12 homologs. These regions mediate formation of a dimer only in ClbP and FmtA (PDB ID: 5ZH8), but the dimer geometry is different, and only the ClbP dimer has the active sites of the two subunits facing each other on either side of the substrate-binding cavity. The catalytic triad and product analog of ClbP are shown as sticks for context.

References

    1. Nougayrede JP, et al. Escherichia coli induces DNA double-strand breaks in eukaryotic cells. Science. 2006;313:848–851.
    2. Buc E, et al. High prevalence of mucosa-associated E. coli producing cyclomodulin and genotoxin in colon cancer. PLoS ONE. 2013;8:e56964.
    3. Arthur JC, et al. Microbial genomic analysis reveals the essential role of inflammation in bacteria-induced colorectal cancer. Nat. Commun. 2014;5:4724.
    4. Cuevas-Ramos G, et al. Escherichia coli induces DNA damage in vivo and triggers genomic instability in mammalian cells. Proc. Natl Acad. Sci. USA. 2010;107:11537–11542.
    5. Arthur JC, et al. Intestinal inflammation targets cancer-inducing activity of the microbiota. Science. 2012;338:120–123.
