Proc Natl Acad Sci U S A. 2004 Sep 7;101(36):13251-6.
doi: 10.1073/pnas.0404833101. Epub 2004 Aug 26.

Large-scale sequencing of the CD33-related Siglec gene cluster in five mammalian species reveals rapid evolution by multiple mechanisms

Takashi Angata et al. Proc Natl Acad Sci U S A.

Abstract

Siglecs are a recently discovered family of animal lectins that belong to the Ig superfamily and recognize sialic acids (Sias). CD33-related Siglecs (CD33rSiglecs) are a subgroup with as-yet-unknown functions, characterized by sequence homology, expression on innate immune cells, conserved cytosolic tyrosine-based signaling motifs, and a clustered localization of their genes. To better understand the biology and evolution of CD33rSiglecs, we sequenced and compared the CD33rSiglec gene cluster from multiple mammalian species. Within the sequenced region, the segments containing CD33rSiglec genes showed a lower degree of sequence conservation. In contrast to the adjacent conserved kallikrein-like genes, the CD33rSiglec genes showed extensive species differences, including expansions of gene subsets; gene deletions, including one human-specific loss of a novel functional primate Siglec (Siglec-13); exon shuffling, generating hybrid genes; accelerated accumulation of nonsynonymous substitutions in the Sia-recognition domain; and multiple instances of mutations of an arginine residue essential for Sia recognition in otherwise intact Siglecs. Nonsynonymous differences between human and chimpanzee orthologs showed uneven distribution between the two beta sheets of the Sia-recognition domain, suggesting biased mutation accumulation. These data indicate that CD33rSiglec genes are undergoing rapid evolution via multiple genetic mechanisms, possibly due to an evolutionary "arms race" between hosts and pathogens involving Sia recognition. These studies, which reflect one of the most complete comparative sequence analyses of a rapidly evolving gene cluster, provide a clearer picture of the ortholog status of CD33rSiglecs among primates and rodents and also facilitate rational recommendations regarding their nomenclature.
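To make the distinction between synonymous and nonsynonymous differences concrete, the following is a minimal sketch that counts differing codons between two aligned coding sequences and classifies each by whether the encoded amino acid changes. It is not the method used in the study, which is not described in this abstract. The function name `count_codon_differences` and the toy sequences are hypothetical. A codon-by-codon tally like this is much cruder than a proper dN/dS estimate, which also normalizes by the number of synonymous and nonsynonymous sites.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs are assumed to be in-frame, codon-aligned DNA strings of
    equal length. Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")

    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not classified
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Invented toy sequences, not taken from any Siglec gene.
toy_a = "ATGCGTAAA"  # Met, Arg, Lys
toy_b = "ATGCGCGAA"  # Met, Arg, Glu
print(count_codon_differences(toy_a, toy_b))
```

In the toy pair, CGT and CGC both encode arginine (a synonymous difference), while AAA to GAA changes lysine to glutamate (a nonsynonymous difference).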

Figures

Fig. 1.
Schematic representation of Siglecs in primates and rodents. Siglecs have one V-set domain (a domain similar to Ig's variable region) and 1–16 C2-set domains (domains similar to Ig's constant region), followed by transmembrane and cytoplasmic domains. Genes for Sialoadhesin/Siglec-1, CD22/Siglec-2, and myelin-associated glycoprotein/Siglec-4 are located outside of the Siglec gene cluster in both primates and rodents. Clear orthologs have been established for each of these genes between human and mouse. Most of the genes for CD33rSiglec subfamily are in the Siglec cluster described here, with the exception of primate Siglec-11 and rodent Siglec-H, whose genes are outside of the gene cluster (indicated with square brackets). CD33rSiglecs are further classified into five subgroups (V1C1, V1C2, V1C3, V1C4, and V2C2), based on the number of V- and C2-set Ig-like domains. The basic configuration of the V2C2 subgroup (Siglec-7 and 12/XII) is V1 + V1C2, and the V1C2 part is highly similar to other Siglecs with V1C2 configuration and thus can be considered a part of the V1C2 subgroup. Although the primate SIGLEC6 gene has V1C3 configuration similar to that of SIGLEC5, the exon coding for a potential third C2-set domain is inactivated. Similarly, the primate SIGLEC7 gene has the exon coding for a potential second V-set domain inactivated. These are indicated with †.
Fig. 2.
Comparisons of the Siglec gene cluster in human, chimpanzee, baboon, rat, and mouse. Order and arrangement of KLK-like genes (green triangles), the Siglec genes (red triangles), and pseudogenes (white triangles), as well as other genes (black triangles) in five genomes. The dotted lines represent regions (>30 kb in length) present in human genomes but absent in the chimpanzee and baboon genomes.
Fig. 3.
Molecular phylogenetic analyses of KLK-like molecules and Siglecs. Although human Siglec-11 and mouse Siglec-H genes are outside the cluster sequenced here, they were included in this analysis for comparison. (A) KLK-like molecules (full-length; 357 aa). Human KLKs1–3 were used together as an outgroup (not shown). Chimpanzee and baboon KLK6 were not included in the analysis because of insufficient sequence data for this gene. (B) Siglec N-terminal regions (signal peptide, Ig1 and Ig2, and linker peptide between Ig2 and Ig3; 291 aa). Primate Siglec-12/XII was excluded because it has two V-set domains and cannot be aligned with other Siglecs. (C) Siglec C-terminal regions (transmembrane domain and cytosolic tail; 184 aa). Rodent Siglec-3, Siglec-H, and primate Siglec-13 were excluded because these have much shorter cytoplasmic domains than other Siglecs. Amino acid sequences were used for calculating the distance matrix and reconstruction of the phylogenetic trees by using the neighbor-joining method. The bootstrap support value for each internode (as percent for 1,000 replications) is indicated above it. Siglec-4 was used as an outgroup for the trees in B and C. Note that Siglec-4 is not a CD33rSiglec but shows overall structural similarity to them, especially at the C-terminal region.
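The first step described in the caption, calculating a distance matrix from aligned amino acid sequences, can be sketched as follows. The caption does not say which distance measure was used, so the simple p-distance (proportion of differing sites, ignoring gapped positions) is an assumption made here for illustration. The function names and the toy alignment are hypothetical. A neighbor-joining routine would then take such a matrix as input.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing residues between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    differing = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a != res_b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differing / compared


def distance_matrix(alignment):
    """Build a symmetric nested-dict distance matrix from {name: sequence}."""
    names = list(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for name_a, name_b in combinations(names, 2):
        d = p_distance(alignment[name_a], alignment[name_b])
        matrix[name_a][name_b] = d
        matrix[name_b][name_a] = d
    return matrix


# Invented toy alignment, not real Siglec or KLK sequences.
toy_alignment = {
    "seqA": "MKTLLA",
    "seqB": "MKTLIA",
    "seqC": "MRT-IA",
}
toy_matrix = distance_matrix(toy_alignment)
for name, row in toy_matrix.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

Bootstrap support of the kind reported in the figure would be obtained by resampling alignment columns with replacement, rebuilding the matrix and tree each time, and counting how often each internode recurs.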
