Front Immunol. 2021 Feb 24:11:603980.
doi: 10.3389/fimmu.2020.603980. eCollection 2020.

Poorly Expressed Alleles of Several Human Immunoglobulin Heavy Chain Variable Genes are Common in the Human Population

Mats Ohlin

Abstract

Extensive diversity has been identified in the human heavy chain immunoglobulin locus, including allelic variation, gene duplication, and insertion/deletion events. Several genes have been suggested to be deleted in many haplotypes. Such findings have commonly been based on inference of the germline repertoire from data sets covering antibody heavy chain encoding transcripts. The inference process operates under conditions that may limit identification of genes transcribed at low levels. The presence of rare transcripts that would indicate the existence of poorly expressed alleles in haplotypes that otherwise appear to have deleted these genes has been assessed in the present study. Alleles IGHV1-2*05, IGHV1-3*02, IGHV4-4*01, and IGHV7-4-1*01 were all identified as being expressed from multiple haplotypes, but only at low levels, haplotypes that by inference often appeared not to express these genes at all. These genes are thus not as commonly deleted as previously thought. An assessment of the 5' untranslated region (up to and including the TATA-box), the signal peptide-encoding part of the gene, and the 3'-heptamer suggests that the alleles have no or minimal sequence difference in these regions in comparison to highly expressed alleles. This suggests that they may be able to participate in immunoglobulin gene rearrangement, transcription and translation. However, all four poorly expressed alleles harbor unusual sequence variants within their coding region that may compromise the functionality of the encoded products, thereby limiting their incorporation into the immunoglobulin repertoire. Transcripts based on IGHV7-4-1*01 that had undergone somatic hypermutation and class switch had mutated the codon that encoded the unusual residue in framework region 3 (cysteine 92; located far from the antigen binding site). This finding further supports the poor compatibility of this unusual residue in a fully functional protein product.
Indications of a linkage disequilibrium were identified as IGHV1-2*05 and IGHV4-4*01 co-localized to the same haplotypes. Furthermore, transcripts of two of the poorly expressed alleles (IGHV1-3*02 and IGHV4-4*01) mostly do not encode in-frame, functional products, suggesting that these alleles might be essentially non-functional. It is proposed that the functionality status of immunoglobulin genes should also include assessment of their ability to encode functional protein products.

Keywords: adaptive immune receptor repertoire; allelic diversity; antibody heavy chain; germline gene; haplotype; immunoglobulin; inference; next generation sequencing.

Conflict of interest statement

MO is a member of the Adaptive Immune Receptor Repertoire (AIRR) Community’s Germline Database Working Group, and its Inferred Allele Review Committee. The Committee defines processes for approval of immunoglobulin gene alleles identified through computational inference, and also approves inferences of such alleles.

Figures

Figure 1
Summary of analysis process and output files. Raw sequence data was obtained from the European Nucleotide Archive (ENA) and processed by IgDiscover to generate an inferred genotype of each subject. The assembled sequence files that were generated during the IgDiscover process were used to define (1) the closest germline gene, (2) the perceived functionality, (3) the association to alleles of IGHJ6 (for haplotype analysis), and (4) the length of the encoded CDR3, of each read as assessed by IMGT/HighV-QUEST. Output files generated after analysis by TIgGER (7, 10) and RAbHIT (18) (to define the IGHV genotype of investigated individuals, and haplotypes as defined by association of reads to alleles of IGHJ6, respectively) were obtained directly from the VDJbase portal (https://vdjbase.org).
Figure 2
The number of reads of alleles likely present in the germline repertoire of 35 subjects (Table 1), associated with each of the two IGHJ6 alleles of the genotype. (A) Three data sets that express 1–2 alleles of IGHV1-2 other than IGHV1-2*05 (left part of panel), and six data sets that express IGHV1-2*05. (B) One data set that is homozygous for IGHV1-3*01 (left part of panel) and 21 data sets that express IGHV1-3*02, 19 of which also contain reads associated with IGHJ6. (C) One data set that expresses two different alleles of IGHV4-4 other than IGHV4-4*01 (left part of panel), and six data sets that express IGHV4-4*01, five of which also contain reads associated with IGHJ6. (D) One data set that is homozygous for IGHV7-4-1*02 (left part of panel), and 23 data sets that express IGHV7-4-1*01, 22 of which also contain reads associated with IGHJ6. In all cases, haplotype 1 represents the haplotype carrying the IGHJ6 allele with the lowest alphanumeric name in the data set in question [in all cases but one (ERR2567242) this allele is IGHJ6*02]. Only reads that IMGT/HighV-QUEST analysis uniquely associated with a single IGHV allele and a single allele of IGHJ6 were used in the calculation to generate this illustration.
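The counting rule described in this caption can be illustrated with a minimal Python sketch. It is not the author's code: the function name `count_reads_by_haplotype`, the input layout (one pair of allele-call lists per read), and the example reads are all hypothetical, and parsing of real IMGT/HighV-QUEST output is left out. The sketch keeps only reads with exactly one IGHV allele call and exactly one IGHJ6 allele call, and labels as haplotype 1 the IGHJ6 allele of the genotype with the lowest alphanumeric name.

```python
from collections import Counter


def count_reads_by_haplotype(reads, ighv_gene, j6_genotype):
    """Count reads per (IGHV allele, haplotype) for one data set.

    reads       -- iterable of (v_calls, j_calls); each is a list of allele names
                   (hypothetical layout, not a real IMGT/HighV-QUEST format)
    ighv_gene   -- gene of interest, e.g. "IGHV1-2"
    j6_genotype -- the two IGHJ6 alleles of the subject's genotype
    """
    # Haplotype 1 carries the IGHJ6 allele with the lowest alphanumeric name.
    ordered = sorted(j6_genotype)
    haplotype_of = {allele: i + 1 for i, allele in enumerate(ordered)}

    counts = Counter()
    for v_calls, j_calls in reads:
        # Use only reads uniquely assigned to one IGHV and one IGHJ allele.
        if len(v_calls) != 1 or len(j_calls) != 1:
            continue
        v_allele, j_allele = v_calls[0], j_calls[0]
        # The trailing "*" keeps "IGHV1-2" from matching e.g. "IGHV1-24".
        if not v_allele.startswith(ighv_gene + "*"):
            continue
        # Reads not carrying one of the genotype's IGHJ6 alleles are ignored.
        if j_allele not in haplotype_of:
            continue
        counts[(v_allele, haplotype_of[j_allele])] += 1
    return counts


# Made-up example reads for illustration only.
reads = [
    (["IGHV1-2*05"], ["IGHJ6*02"]),
    (["IGHV1-2*02"], ["IGHJ6*03"]),
    (["IGHV1-2*02"], ["IGHJ6*03"]),
    (["IGHV1-2*02", "IGHV1-2*04"], ["IGHJ6*02"]),  # ambiguous IGHV call: skipped
    (["IGHV1-2*05"], ["IGHJ4*02"]),                # not IGHJ6: skipped
    (["IGHV4-4*01"], ["IGHJ6*02"]),                # different gene: skipped
]
counts = count_reads_by_haplotype(reads, "IGHV1-2", ("IGHJ6*03", "IGHJ6*02"))
for (allele, haplotype), n in sorted(counts.items()):
    print(allele, "haplotype", haplotype, n)
```

An allele whose reads fall almost entirely on one haplotype, even at low counts, is the kind of pattern that would point to a poorly expressed allele rather than a deleted gene.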
Figure 3
The number of nucleotides in the CDR3-encoding part of reads derived from IGHV1-2*05 (A), IGHV1-3*02 (B), IGHV4-4*01 (C), and IGHV7-4-1*01 (D), extracted from all donors that use these genes. The length distributions are compared with those of IGHV1-2*02 (data set ERR2567226), IGHV1-3*01 (data set ERR2567206), IGHV4-4*02 (data set ERR2567192), and IGHV7-4-1*02 (data set ERR2567206), respectively.
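A CDR3 length distribution of this kind can be tabulated with a few lines of Python. The sketch below is illustrative only: the function name `cdr3_length_profile` and the example sequences are hypothetical. It also uses "length divisible by three" as a rough proxy for an in-frame rearrangement, which is an assumption of this sketch and not the functionality assessment performed by IMGT/HighV-QUEST in the study.

```python
from collections import Counter


def cdr3_length_profile(cdr3_nt_seqs):
    """Return (length histogram, fraction of CDR3s whose length is a multiple of 3)."""
    lengths = Counter(len(seq) for seq in cdr3_nt_seqs)
    total = sum(lengths.values())
    # Proxy assumption: a CDR3 whose nucleotide length is divisible by 3
    # is treated as in-frame; real tools also check the junction frame.
    in_frame = sum(n for length, n in lengths.items() if length % 3 == 0)
    fraction = in_frame / total if total else 0.0
    return lengths, fraction


# Made-up CDR3 nucleotide sequences for illustration only.
example_cdr3s = [
    "GCGAGAGATTAC",     # 12 nt, divisible by 3
    "GCGAGAGATTA",      # 11 nt
    "GCGAGAGATTACTGG",  # 15 nt, divisible by 3
    "GCGAGAGATT",       # 10 nt
]
lengths, in_frame_fraction = cdr3_length_profile(example_cdr3s)
print(sorted(lengths.items()), in_frame_fraction)
```

A distribution dominated by lengths that are not multiples of three would be consistent with the abstract's observation that transcripts of some poorly expressed alleles mostly do not encode in-frame products.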

References

    1. Henry Dunand CJ, Wilson PC. Restricted, canonical, stereotyped and convergent immunoglobulin responses. Philos Trans R Soc Lond B Biol Sci (2015) 370:20140238. 10.1098/rstb.2014.0238
    2. Avnir Y, Watson CT, Glanville J, Peterson EC, Tallarico AS, Bennett AS, et al. IGHV1-69 polymorphism modulates anti-influenza antibody repertoires, correlates with IGHV utilization shifts and varies by ethnicity. Sci Rep (2016) 6:20842. 10.1038/srep20842
    3. Collins AM, Yaari G, Shepherd AJ, Lees W, Watson CT. Germline immunoglobulin genes: disease susceptibility genes hidden in plain sight? Curr Opin Syst Biol (2020) 24:100–8. 10.1016/j.coisb.2020.10.011
    4. Giudicelli V, Chaume D, Lefranc MP. IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes. Nucleic Acids Res (2005) 33:D256–261. 10.1093/nar/gki010
    5. Wang Y, Jackson KJ, Sewell WA, Collins AM. Many human immunoglobulin heavy-chain IGHV gene polymorphisms have been reported in error. Immunol Cell Biol (2008) 86:111–5. 10.1038/sj.icb.7100144
