Review
Chemistry. 2024 May 28;30(30):e202400660.
doi: 10.1002/chem.202400660. Epub 2024 Apr 30.

Secondary Sites of the C-type Lectin-Like Fold

Jonathan Lefèbre et al. Chemistry.

Abstract

C-type lectins are a large superfamily of proteins involved in a multitude of biological processes. In particular, their involvement in immunity and homeostasis has rendered them attractive targets for diverse therapeutic interventions. They share a characteristic C-type lectin-like domain whose adaptability enables them to bind a broad spectrum of ligands beyond the originally defined canonical Ca2+-dependent carbohydrate binding. Together with variable domain architecture and high-level conformational plasticity, this enables C-type lectins to meet diverse functional demands. Secondary sites provide another layer of regulation and are often intricately linked to functional diversity. Located remote from the canonical primary binding site, secondary sites can accommodate ligands with other physicochemical properties and alter protein dynamics, thus enhancing selectivity and enabling fine-tuning of the biological response. In this review, we outline the structural determinants allowing C-type lectins to perform a large variety of tasks and to accommodate the ligands associated with them. Using the six well-characterized Ca2+-dependent and Ca2+-independent C-type lectin receptors DC-SIGN, langerin, MGL, dectin-1, CLEC-2 and NKG2D as examples, we focus on the characteristics of non-canonical interactions and secondary sites and their potential use in drug discovery endeavors.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

C-type lectins, a large superfamily of proteins, are crucial in immunity and homeostasis. Their C-type lectin-like domain enables them to bind a wide range of ligands, including those not dependent on Ca2+ interactions. This adaptability is further enhanced by their variable domain architecture and conformational plasticity. Secondary binding sites, often located away from the primary site, contribute to the functional diversity of C-type lectins by accommodating different ligands, thus offering potential for drug discovery by fine-tuning biological responses and increasing selectivity.
Figure 1. Structure of the C-type lectin-like domain (CTLD).
(a) The C-type lectin-like fold with secondary structure element numbering using DC-SIGN (PDB ID: 1SL4) as an example. The long loop region is highlighted in blue. Disulfide bonds are shown as sticks. Ca2+ ions are shown as green spheres. The fourth Ca2+ site located between α2, β1, and β5 is not shown. Secondary structure element numbering according to Zelensky and Gready.[3] (b) The EPN and WND motifs (top) in the long loop region and β4, respectively, coordinate the Ca2+ ion in Ca2+ site II. The WIGL (WMGL in DC-SIGN) motif (bottom) in β2 forms the hydrophobic core. DC-SIGN (PDB ID: 1SL4) is shown as an example CTLD. (c) Structural alignment of 38 unique PDB IDs of human CTLDs. While the CTLD fold is structurally highly conserved, the long loop region, harboring the Ca2+ sites and the canonical CBS, is structurally diverse. Coloring of the structures according to RMSD between X-ray crystallographic structures.
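The RMSD used for the coloring in panel (c) can be illustrated with a minimal sketch. It assumes the two structures have already been superposed and that their atoms are paired one-to-one; the coordinates are hypothetical toy values, not taken from any PDB entry, and the caption does not state which atoms or software the authors used.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between paired atoms
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates in angstrom (illustration only)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

rmsd_value = rmsd(reference, shifted)
print(f"RMSD = {rmsd_value:.2f} angstrom")
```

A per-residue version of the same calculation, applied to each aligned position across the structures, would give values suitable for coloring conserved and variable regions.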
Figure 2. Canonical glycan binding at Ca2+ site II and the extended canonical CBS using glycolipid recognition by bovine mincle (PDB ID: 4ZRV) as an example.
(a) Canonical glycan binding is dependent on Ca2+ site II in the long loop region. Bovine mincle in complex with trehalose monobutyrate is shown. Ca2+ ions are shown as green spheres. (b) The EPN motif (E168, P169, N170) together with the WND motif (W191, N192, D193) and E176 mediate interaction with the first glucose (Glc-1) moiety of trehalose monobutyrate via coordination of Ca2+ in Ca2+ site II together with 3- and 4-OH of Glc-1. (c) Extended glycolipid binding site of mincle. The primary carbohydrate site (1) coordinates Glc-1, as shown in (b), while the proximate carbohydrate site (2) interacts with Glc-2. The acyl chain of the trehalose monobutyrate is not fully resolved but points towards a hydrophobic groove designated as the primary lipid site (3). A second lipid site (4) was also proposed.[–57]
Figure 3. Structure and ligand binding sites of the LOX-1 CTLD.
Front (a) and top (b) view of the X-ray crystallographic structure of the LOX-1 homodimer (PDB ID: 1YPQ). The hydrophobic tunnel bound to a dioxane molecule at the interface of the LOX-1 subunits was proposed to function as a lipid binding site. Arginine residues forming the basic spine necessary for oxLDL binding are highlighted as green sticks. Secondary structure numbering according to Zelensky and Gready.[3] LOX-1 inhibitor BI-0155 (c) stabilizes the inactive tetramer state through head-to-head cross-linking of two LOX-1 dimers (PDB ID: 6TL9). (e) Close-up of the small molecule binding site. Two BI-0155 molecules (green sticks) bind two dimers, LOX-1A/B and LOX-1C/D. The basic spine, as highlighted by residues R248B and R248D, is partly covered, preventing interaction with oxLDL.
Figure 4. Canonical glycan binding and secondary sites in DC-SIGN.
(a) High-mannose type oligosaccharide recognition at the extended canonical CBS using the X-ray crystallographic structure of DC-SIGN in complex with GlcNAc2Man3 (PDB ID: 1K9I). Residues involved in Ca2+-coordination of the central mannose (Man) and formation of the extended canonical CBS for interactions with the oligosaccharide are shown as sticks. Residues F313, S360, and V351, which are mostly responsible for forming additional interactions, are highlighted. Ca2+ ions are shown as green spheres. (b) Fucose-type oligosaccharide recognition at the extended canonical CBS using the X-ray crystallographic structure of DC-SIGN in complex with lacto-N-fucopentaose III (PDB ID: 1SL5). Residues involved in Ca2+-coordination and formation of the extended CBS for interactions with the oligosaccharide are shown as sticks. V351 is highlighted as a contributor to the recognition of the central fucose (Fuc). Ca2+ ions are shown as green spheres. (c) Proposed druggable secondary sites in DC-SIGN (PDB ID: 1K9I) and examples of fragment hits for each site. The GlcNAc2Man3 oligosaccharide indicates the canonical CBS. Figure adapted from Aretz et al.[124]
Figure 5. Proposed mechanism of selectivity of heteromultivalent liposomes carrying mannose and a biphenyl mannoside for DC-SIGN (PDB ID: 1SL4).
While the biphenyl moiety of the biphenyl mannoside interacts with a previously identified secondary site centered on residue M270 (shown in blue), the mannose moiety predominantly binds to the canonical CBS of DC-SIGN (shown in green). This leads to chelation-derived avidity enhancement and allosteric activation of the canonical CBS or its associated Ca2+ sites, driving selectivity of the biphenyl mannoside for DC-SIGN.[13]
Figure 6. Carbohydrate binding mode, function of the loops, and secondary binding sites in langerin.
(a) Overlay of mannose, glucosamine, and fucose (in blood group B trisaccharide) interacting with langerin. The hydroxy groups coordinating the Ca2+ are all in equatorial configuration (PDB ID: 3P7G, 4N32, 3P5G). Ca2+ is shown as a green sphere, and the residues coordinating Ca2+, E285, N287, E293, and N307, are shown as sticks. (b) The short loop of langerin differs in conformation between human (blue) and murine (yellow) homologs (PDB ID: 3C22, 5K8Y). (c) The surface charge on langerin shows the negatively charged canonical CBS and the positively charged extended canonical CBS, enabling salt bridges to form with 6-sulfated galactose. Surface charge was calculated using the APBS webserver.[195] (d) Interaction details of 6-sulfated galactosamine; residues involved in the recognition are shown as sticks; salt bridges and hydrogen bonds are shown in yellow (PDB ID: 3P5I). (e) Proposed allosteric inhibition ‘switch’ mechanism of thiazolopyrimidine binding to murine langerin. Adapted from Aretz et al.[14] (f) The tryptophan (in green) of a Strep-tag from the neighboring langerin unit located at the cleft formed between the short loop and loop of langerin (PDB ID: 3P7H).
Figure 7. Carbohydrate binding mode, extended canonical CBS, and remote secondary site of the MGL CRD.
(a) X-ray crystallographic structure of MGL bound to N-acetylgalactosamine (PDB ID: 6PY1). Residues related to carbohydrate recognition and Ca2+-coordination are shown as sticks. Ca2+ is shown as a green sphere. (b) Chemical shift perturbations (CSPs) in 1H-15N HSQC spectrum induced by GM2 are mapped on PDB ID: 6PY1. Adapted from Diniz et al.[68] The canonical CBS residues are colored in red; the extended canonical CBS residues are colored in orange (CSP > 0.05 ppm) and yellow (CSP < 0.05 ppm). (c) Remote secondary site revealed by 1H-15N BEST-TROSY spectra of MGL. CSPs induced by E. coli R1 lipooligosaccharide are mapped on the CTLD of MGL (PDB ID: 6PY1). The CBS is shown on the left, and the remote secondary site is shown on the right. Residues inducing large perturbations (CSP > 0.018 ppm) are shown in green.[101]
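The CSP thresholds in panel (b) can be illustrated with a short sketch. The caption does not say how the 1H and 15N shift changes were combined, so the weighted Euclidean formula and the 15N scaling factor of 0.14 below are assumptions based on a commonly used convention, not the cited authors' procedure. The residue labels and shift values are hypothetical.

```python
import math

ALPHA_15N = 0.14  # assumed 15N scaling factor; not stated in the caption

def combined_csp(delta_h_ppm, delta_n_ppm, alpha=ALPHA_15N):
    """Combine 1H and 15N shift changes into one CSP value (assumed convention)."""
    return math.sqrt(delta_h_ppm ** 2 + (alpha * delta_n_ppm) ** 2)

def color_for(csp_ppm, threshold=0.05):
    """Orange above the 0.05 ppm threshold from the caption, yellow otherwise."""
    # The caption leaves CSP == threshold undefined; it falls into yellow here.
    return "orange" if csp_ppm > threshold else "yellow"

# Hypothetical (delta 1H, delta 15N) shift changes in ppm, illustration only
shift_changes = {"RES_A": (0.03, 0.40), "RES_B": (0.01, 0.10)}

colors = {
    residue: color_for(combined_csp(d_h, d_n))
    for residue, (d_h, d_n) in shift_changes.items()
}
print(colors)
```

The same function covers the 0.018 ppm cutoff of panel (c) by passing `threshold=0.018`.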
Figure 8. β-glucan binding site and proposed oligomerization site mapped on the murine dectin-1 CTLD (PDB ID: 2BPE).
Amino acids of the hydrophobic group forming the putative non-canonical carbohydrate recognition site for β-glucans in the upper lobe of the CTLD are labeled green.[99,100] Amino acids important for ligand-induced cooperative oligomerization of the CTLD are labeled in blue.[100] Secondary structure elements were numbered according to the nomenclature proposed by Zelensky and Gready.[3]
Figure 9. PDPN recognition of CLEC-2 via a non-canonical binding site.
(a) Crystal structure of the CLEC-2 CTLD (white) in complex with PDPN-derived glycopeptide (green) (PDB ID: 3WSR) showing the binding site of PDPN located in the lower lobe of the CLEC-2 CTLD between α1, α2, β0, and β1. Secondary structure elements were numbered according to the nomenclature proposed by Zelensky and Gready.[3] (b) Amino acid sequence of PDPN glycopeptide residues that show electron density in the crystal structure with di-sialylated core I O-glycan attached to T52 (the α2-3 linked N-acetylneuraminic acid was not resolved in the structure) (left) and close-up of the PDPN binding site of CLEC-2 (right). The glycopeptide is mostly recognized via two acidic amino acids (E47 and D48) and the α2-6 linked N-acetylneuraminic acid. Amino acids of CLEC-2 that are important for the interaction are shown as sticks and labeled.
Figure 10. Distinct protein and small-molecule binding sites of NKG2D.
(a) The NKG2D homodimer (blue and white) in complex with MICA (green) (PDB ID: 1HYR). Secondary structure elements were numbered according to Li et al.[105] Interaction hotspots Y152 and Y199 for the recognition of protein ligands by NKG2D are shown as sticks. (b) The NKG2D homodimer in complex with small-molecule ligand 3e (PDB ID: 8EA6) occupying a cryptic pocket in the interaction interface of the two subunits. (c) Chemical structure of NKG2D ligand 3e. (d) Close-up of the 3e binding site. The ligand forms hydrogen bonds with L148A, K150A, and K150B, replacing reciprocal backbone interactions of the unbound dimer. F113B is the only sidechain affected by ligand binding and forms π–π stacking interactions with the phenyl rings of 3e. Hydrogen bonds are shown as dashed lines.
Figure 11. Summary of the protein X-ray crystallographic structures shown in this review illustrating the different binding sites utilized by the CTLD fold.
The backbone of the CTLD is displayed in cartoon representation with Ca2+ ions depicted as green spheres; complexed ligands are hidden for clarity. Glycan-interaction sites are colored green, small molecule binding sites are colored blue, and protein-protein interaction sites are shown in pink. Side chains of amino acids highlighted by any color are depicted as sticks. CTLDs of group II CLRs are shown in the upper row (a-d), and CTLDs of group V CLRs are shown in the lower row (e-h): (a) (bovine) mincle (PDB ID: 4ZRV) shows canonical carbohydrate recognition; amino acids within 5 Å of bound glucose (not shown) are colored green.[56] (b) DC-SIGN (PDB ID: 1K9I) binds carbohydrates via the canonical CBS. For oligosaccharide ligands, the binding site is extended by a proximal binding pocket. Amino acids within 5 Å of bound GlcNAc2Man3 (not shown) are highlighted in green to illustrate this extended binding site.[168] In addition, DC-SIGN binds a biphenyl mannoside via a Ca2+-independent binding site in the lower lobe of the CTLD mapped by 1H-15N HSQC NMR and shown in blue.[13] Four additional binding sites for drug-like fragments have been experimentally verified (not shown).[124] (c) (murine) langerin (PDB ID: 3P5D) binds carbohydrates in the canonical CBS; amino acids within 5 Å of glucose (not shown) are depicted in green. The binding site for thiazolopyrimidines in the loop region of the protein was mapped by solution paramagnetic relaxation enhancement in 1H-15N HSQC NMR and is shown in blue.[132] The recognition of sulfated sugars via an extended binding site and a remote secondary binding site on human langerin are not shown.[189,191] (d) MGL (PDB ID: 6PY1) binds carbohydrates in the canonical CBS; amino acids within 5 Å of N-acetylgalactosamine (not shown) are colored in green.[207] A remote secondary site utilized for binding of E. coli R1 lipooligosaccharide (LOS) in the lower lobe of the CTLD was mapped by CSP in 1H-15N BEST-TROSY NMR and is shown in dark green.[101] (e) (murine) dectin-1 (PDB ID: 2BPE) shows Ca2+-independent recognition of β-glucans. Essential amino acids forming the putative binding site in the upper lobe of the protein are colored green.[99,100] (f) CLEC-2 (PDB ID: 3WSR) exhibits binding of a glycopeptide of PDPN via a remote binding site in the lower lobe of the CTLD. Amino acids making up the binding locus for the peptide part of the ligand (not shown) are colored pink, and amino acids making up the binding locus for the carbohydrate part of the ligand (not shown) are in green. R118 is part of both loci and is shown in orange.[98] (g) NKG2D (PDB ID: 8EA6) has a binding site for a small-molecule ligand in the interaction interface of the physiologically functional homodimer (both small molecule and second monomer not shown). Amino acids within 5 Å of the small molecule are colored blue.[135] The small molecule ligand prevents binding of the NKG2D homodimer to endogenous protein ligands containing an α1α2 platform domain. Amino acids within 5 Å of MICA in the complex with the NKG2D dimer (PDB ID: 1HYR; not shown) are highlighted in pink.[105] (h) LOX-1 (PDB ID: 6TL9) is a functional homodimer (second monomer not shown) that recognizes protein ligands like oxLDL via the upper lobe of the domain. Amino acids of the so-called basic spine that mediates this interaction are shown in pink.[40,41] Further, LOX-1 binds small-molecule ligand BI-0155 (not shown) in an additional binding site located in the upper lobe of the domain. Two BI-0155 molecules link two LOX-1 dimers, forming an inactive tetramer.[89]
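The "within 5 Å of the ligand" criterion used throughout this figure can be sketched as a plain distance filter. This is an illustrative sketch only: the caption does not say which software or atom selection the authors used, and the residue labels and coordinates below are hypothetical toy values rather than data from any PDB entry.

```python
import math

def residues_within(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted labels of residues with any atom within `cutoff` angstrom of a ligand atom.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    """
    close = set()
    for label, xyz in protein_atoms:
        # One atom inside the cutoff is enough to select the whole residue
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            close.add(label)
    return sorted(close)

# Hypothetical atoms (illustration only)
protein = [
    ("RES_A", (0.0, 0.0, 0.0)),
    ("RES_A", (10.0, 0.0, 0.0)),
    ("RES_B", (4.0, 0.0, 0.0)),
    ("RES_C", (20.0, 0.0, 0.0)),
]
ligand = [(1.0, 0.0, 0.0)]

binding_site = residues_within(protein, ligand)
print(binding_site)
```

Selecting a residue as soon as any one of its atoms lies within the cutoff is a design choice of this sketch; a stricter variant could require a side-chain atom instead.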

References

    1. Varki A. Glycobiology. 2017;27(1):3–49. doi: 10.1093/glycob/cww086.
    2. Varki A, Cummings RD, Esko JD, Stanley P, Hart GW, Aebi M, et al., editors. Essentials of Glycobiology. 4th ed. Cold Spring Harbor (NY); 2022.
    3. Zelensky N, Gready JE. FEBS J. 2005;272(24):6179–217.
    4. McMahon SA, Miller JL, Lawton JA, Kerkow DE, Hodes A, Marti-Renom MA. Nat Struct Mol Biol. 2005;12(10):886–92.
    5. Geijtenbeek TBH, Krooshoop DJEB, Bleijs DA, Vliet SJv, Grabovsky V, et al. Nat Immunol. 2000;1(4):353–7.
