Biochem Biophys Rep. 2019 Jul 17;19:100659.
doi: 10.1016/j.bbrep.2019.100659. eCollection 2019 Sep.

Missense mutations in the C-terminal portion of the B4GALNT2-encoded glycosyltransferase underlying the Sd(a-) phenotype

Linn Stenfelt et al. Biochem Biophys Rep. 2019.

Abstract

Sda is a high-frequency carbohydrate histo-blood group antigen, GalNAcβ1-4(NeuAcα2-3)Galβ, implicated in pathogen invasion, cancer, xenotransplantation and transfusion medicine. Complete lack of this glycan epitope results in the Sd(a-) phenotype observed in 4% of individuals who may produce anti-Sda. A candidate gene (B4GALNT2), encoding a Sda-synthesizing β-1,4-N-acetylgalactosaminyltransferase (β4GalNAc-T2), was cloned in 2003 but the genetic basis of human Sda deficiency was never elucidated. Experimental and bioinformatic approaches were used to identify and characterize B4GALNT2 variants in nine Sd(a-) individuals. Homozygosity for rs7224888:T > C dominated the cohort (n = 6) and causes p.Cys466Arg, which targets a highly conserved residue located in the enzymatically active domain and is judged deleterious to β4GalNAc-T2. Its allele frequency was 0.10-0.12 in different cohorts. One Sd(a-) compound heterozygote combined rs7224888:T > C with a splice-site mutation, rs72835417:G > A, which is predicted to alter splicing and occurred at a frequency of 0.11-0.12. Another compound heterozygote had two rare nonsynonymous variants, rs148441237:A > G (p.Gln436Arg) and rs61743617:C > T (p.Arg523Trp), in trans. One sample displayed no differences compared to Sd(a+). When investigating linkage disequilibrium between B4GALNT2 variants, we noted a 32-kb block spanning intron 9 to the intergenic region downstream of B4GALNT2. This block includes RP11-708H21.4, a long non-coding RNA recently reported to promote tumorigenesis and poor prognosis in colon cancer. The expression patterns of B4GALNT2 and RP11-708H21.4 correlated extremely well in >1000 cancer cell lines. In summary, we identified a connection between variants of the cancer-associated B4GALNT2 gene and Sda, thereby establishing a new blood group system and opening up the possibility of predicting Sd(a+) and Sd(a-) phenotypes by genotyping.

Keywords: B4GALNT2; Glycosyltransferase; Red blood cell; Sd(a‒) phenotype; Sda histo-blood group antigen.

Figures

Fig. 1
The rs7224888 and additional SNPs that correlate with the Sd(a) phenotype. (A) Human B4GALNT2 on chromosome 17 encodes three transcripts, differing in sequence due to differential use of exon 1: long (red box), short (blue box) or a so far only theoretical middle-length exon 1 (green box), as stated in Ensembl release 96 [39]. All transcripts use the same coding sequence of exons 2–11 (black boxes), while the UTR of exon 11 differs (colored accordingly). Black horizontal lines in between the exons depict the introns. (B) The SNPs identified in nine individuals with the Sd(a−) phenotype, after sequencing the coding regions and the proposed promoter of the gene. Magnified sequences surrounding the SNPs (underlined) are shown at the stated nucleotide positions, and below follows the SNP status of each subject, symbolized by stick figures. The variants of interest are written in white. For one of the rs7224888 homozygotes, nucleotide status for rs148441237 was not established. (C) Schematic sketch of the translated proteins from each transcript. Depending on which exon 1 is utilized, the product is predicted to be a transmembrane (NP_703147 and NP_001152859) or a soluble (NP_001152860) glycosyltransferase. The amino acid changes caused by the identified SNPs are found C-terminally of the DXD motif (in this case three consecutive aspartic acids, DDD) and are described in the dark grey boxes on the right, where the amino acid positions are stated for each isoform.
Fig. 2
The 3D protein structure of β-1,4-N-acetylgalactosaminyltransferase 2, based on homology modelling of the crystal structure of chondroitin synthase from E. coli, using SWISS-MODEL [32]. (A) The model consists of amino acids 319–524 in the catalytic domain and is colored by residue number from red in the N-terminal via yellow to blue in the C-terminal. The protein is predicted to form a dimer, as is common for glycosyltransferases. (B) A close-up view of the structure detailing the locations of the DXD motif and the three SNPs of interest in relation to the Sd(a−) phenotype.
Fig. 3
The rs7224888 is located in a highly conserved region among different species. Homologous proteins with significant amino acid alignments based on Protein BLAST of amino acids 451–481. Sequence identity represents similarity with human β-1,4-N-acetylgalactosaminyltransferase 2 (top row) in the analyzed region. Start refers to the position of the first amino acid shown on each line. The red frame shows the amino acid altered by rs7224888 (p.Cys466Arg), and the corresponding amino acids in the related proteins. Light blue, middle blue, and dark blue boxes symbolize identity between 9 and 10, 11–12, or all 13 of the compared sequences, respectively. The figure shows that a highly homologous region can be found in β-1,4-N-acetylgalactosaminyltransferase 1 in many species, and that the cysteine at position 466, as well as many surrounding residues, is conserved in all proteins.
Fig. 4
The rs7224888 resides in a haplotype block of ~32 kb. (A) Linkage disequilibrium (LD) between rs7224888 and other variants in B4GALNT2 and neighboring genes. Each dot represents a variant detected in the 1000 Genomes Project [23] and is color-coded according to its location. The long transcript (ENST00000300404) was used for color-coding variants in B4GALNT2. Variants in AC069454.1 were coded as intronic. The x-axis shows the chromosomal location according to the GRCh37/hg19 human reference genome and the y-axis shows the level of LD with rs7224888 (R²). (B) The B4GALNT2 transcripts and the canonical transcripts for neighboring genes ABI3, GNGT2 and PHOSPO1. AC069454.1 is a ribosomal protein S10 (RPS10) pseudogene and RP11-708H21.4 is a long non-coding RNA (lncRNA). The orange shaded area represents a presumed haplotype of ~32 kb, where 66 variants exhibit strong LD (R² > 0.6) with rs7224888. This haplotype consists of exons 10 and 11 in all B4GALNT2 transcripts as well as RP11-708H21.4. The two other exonic variants in this haplotype are synonymous (rs16946912) or located in the 3′ UTR (rs28689968).
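
As a rough illustration of the LD measure plotted on the y-axis of Fig. 4, the sketch below computes the standard squared correlation between two biallelic variants from haplotype and allele frequencies. It assumes the textbook definition, r² = D² / (pA(1 − pA)pB(1 − pB)) with D = pAB − pA·pB; the source does not state which tool or exact estimator was used. The function name `ld_r_squared` and all input frequencies are hypothetical and are not taken from the study.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first variant
    p_b:  frequency of allele B at the second variant
    """
    for name, p in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two alleles were inherited independently.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical inputs, chosen only to show the two extremes.
# Both alleles at frequency 0.10 and always on the same haplotype:
perfect = ld_r_squared(p_ab=0.10, p_a=0.10, p_b=0.10)
# Same allele frequencies, haplotype frequency equal to the product:
independent = ld_r_squared(p_ab=0.01, p_a=0.10, p_b=0.10)

print(f"always co-inherited: r^2 = {perfect:.3f}")
print(f"independent:         r^2 = {independent:.3f}")
```

Under this definition, a threshold such as the 0.6 used in the caption would select variants whose alleles are largely, though not always, found on the same haplotype as the index variant.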

References

    1. Renton P.H., Howell P., Ikin E.W. Anti-Sda, a new blood group antibody. Vox Sang. 1967;13:493–501.
    2. Macvie S.I., Morton J.A., Pickles M.M. The reactions and inheritance of a new blood group antigen, Sda. Vox Sang. 1967;13:485–492.
    3. Reid M.E., Lomas-Francis C., Olsson M.L. The Blood Group Antigen FactsBook. third ed. Academic Press; London, UK: 2012.
    4. Morton J.A., Pickles M.M., Terry A.M. The Sda blood group antigen in tissues and body fluids. Vox Sang. 1970;19:472–482.
    5. Blanchard D., Capon C., Leroy Y. Comparative study of glycophorin A derived O-glycans from human Cad, Sd(a+) and Sd(a-) erythrocytes. Biochem. J. 1985;232:813–818.
