PLoS One. 2013 Jun 6;8(6):e64683.
doi: 10.1371/journal.pone.0064683. Print 2013.

Imputing amino acid polymorphisms in human leukocyte antigens

Xiaoming Jia et al. PLoS One.

Abstract

DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach: imputation can be used to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize the performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than the SNP density of the genotyping platform, is critical to achieving high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Overview of the SNP2HLA imputation procedure.
The reference panel (top) contains SNPs in the MHC, classical HLA alleles at the class I and class II loci, and amino acid sequences corresponding to the 4-digit HLA types at each locus. For a data set with genotyped SNPs across the MHC (bottom), we use the reference panel to impute classical alleles and their corresponding amino acid polymorphisms.
Figure 2. Correlation between imputed and typed dosages (r² dosage) of classical HLA alleles in the B58BC as a function of typed allele frequency for imputation from the (a) Affymetrix 500 K or (b) Illumina Immunochip platform using the HapMap-CEPH reference panel, and imputation from the (c) Affymetrix 500 K or (d) Illumina Immunochip platform using the T1DGC reference panel.
Black points indicate 2-digit HLA alleles. Red points indicate 4-digit HLA alleles.
Figure 3. Correlation between imputed and typed dosages (r² dosage) of polymorphic amino acids in the B58BC as a function of typed allele frequency for imputation from the (a) Affymetrix 500 K or (b) Illumina Immunochip platform using the HapMap-CEPH reference panel, and imputation from the (c) Affymetrix 500 K or (d) Illumina Immunochip platform using the T1DGC reference panel.
Black points indicate bi-allelic positions. Red points indicate poly-allelic positions.
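The r² dosage metric plotted in Figures 2 and 3 can be illustrated with a minimal sketch. It assumes the metric is the squared Pearson correlation between per-individual imputed dosages (expected allele counts between 0 and 2) and typed dosages (0, 1 or 2); the function name and the example values are hypothetical and are not taken from SNP2HLA or from the study data.

```python
def dosage_r2(imputed, typed):
    """Squared Pearson correlation between imputed and typed allele dosages.

    Assumption: one dosage per individual, imputed in [0, 2], typed in {0, 1, 2}.
    Returns NaN when either vector has no variance (e.g. a monomorphic allele).
    """
    if len(imputed) != len(typed) or len(imputed) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    n = len(imputed)
    mean_i = sum(imputed) / n
    mean_t = sum(typed) / n
    # Sums of cross-products and squares around the means.
    cov = sum((a - mean_i) * (b - mean_t) for a, b in zip(imputed, typed))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_t = sum((b - mean_t) ** 2 for b in typed)
    if var_i == 0 or var_t == 0:
        return float("nan")
    return (cov * cov) / (var_i * var_t)


# Hypothetical dosages for four individuals at one allele.
imputed_dosage = [0.1, 0.9, 1.8, 0.0]
typed_dosage = [0, 1, 2, 0]
print(round(dosage_r2(imputed_dosage, typed_dosage), 4))
```

Computing this value per allele and plotting it against the typed allele frequency would give one point per allele, as in the figure panels.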
Figure 4. Association analysis of WTCCC type 1 diabetes data.
We imputed classical HLA alleles and polymorphic amino acids in 1,963 cases and 2,939 controls using the T1DGC reference panel, and tested all variants for association with logistic regression. Of all variants tested, the top hit maps to amino acid position 57 in HLA-DQβ1, consistent with a previous study.
Figure 5. Haplotype risk analysis of WTCCC type 1 diabetes data.
We assessed the risk of haplotypes spanning HLA-DRB1, HLA-DQA1 and HLA-DQB1, and compared these to the published risk estimates from an independent study. The published odds ratios were based on transmission/non-transmission of alleles in family data, while our odds ratios were estimated from case/control data. We used the same classification scheme, dividing haplotypes into three risk groups. The odds ratios are computed with respect to the DRB1*01-DQA1*0101-DQB1*0501 haplotype.
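As a rough illustration of an odds ratio computed with respect to a reference haplotype, the sketch below divides each haplotype's case/control odds by the odds of the reference haplotype. This is a simple count-based calculation under stated assumptions, not necessarily the estimation procedure used in the study; the function name, the second haplotype label and all counts are hypothetical.

```python
def odds_ratios_vs_reference(case_counts, control_counts, reference):
    """Odds ratio of each haplotype relative to a reference haplotype.

    case_counts / control_counts map haplotype label -> number of haplotype
    copies observed in cases / controls. All counts must be positive.
    """
    for label in case_counts:
        if case_counts[label] <= 0 or control_counts.get(label, 0) <= 0:
            raise ValueError(f"non-positive count for haplotype {label!r}")
    # Case/control odds of carrying the reference haplotype.
    ref_odds = case_counts[reference] / control_counts[reference]
    return {
        label: (case_counts[label] / control_counts[label]) / ref_odds
        for label in case_counts
    }


REFERENCE = "DRB1*01-DQA1*0101-DQB1*0501"  # reference haplotype named in the caption
# Hypothetical counts, for illustration only.
cases = {REFERENCE: 100, "hypothetical_risk_haplotype": 300}
controls = {REFERENCE: 200, "hypothetical_risk_haplotype": 100}

for label, ratio in odds_ratios_vs_reference(cases, controls, REFERENCE).items():
    print(label, ratio)
```

By construction the reference haplotype has an odds ratio of 1, so values above 1 indicate higher risk than the reference and values below 1 indicate lower risk.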
