Am J Hum Genet. 2025 Aug 7;112(8):1833-1851.
doi: 10.1016/j.ajhg.2025.06.011. Epub 2025 Jul 10.

Haplotype analysis reveals pleiotropic disease associations in the HLA region

Courtney J Smith et al. Am J Hum Genet. 2025.

Abstract

The human leukocyte antigen (HLA) region plays an important role in human health through its involvement in immune cell recognition and maturation. While genetic variation in the HLA region is associated with many diseases, the pleiotropic patterns of these associations have not been systematically investigated. Here, we developed a haplotype approach to investigate disease associations phenome wide for 412,181 Finnish individuals and 2,459 diseases. Across the 1,035 diseases with a genome-wide association study association, we found a 17-fold average per-SNP enrichment of hits in the HLA region. Altogether, we identified 7,649 HLA associations across 647 diseases, including 1,750 associations uncovered by haplotype analysis. We found that some haplotypes show both risk-increasing and protective associations across different diseases, while others consistently increase risk across diseases, indicating a complex pleiotropic landscape involving a range of diseases. This study highlights the extensive impact of HLA variation on disease risk and underscores the importance of classical and non-classical genes as well as non-coding variation.

Keywords: FinnGen; GWAS; HLA; UKBB; biobank; genomics; phenome-wide; summary statistics.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1
Study overview. (A) An overview of the HLA region showing the nearest genes to disease-associated SNPs, colored by HLA class, spanning approximately 5 Mb. (B) An overview of the study data and design.
Figure 2
Distribution of GWAS hits across the genome and disease group enrichment. (A) Distribution of fine-mapped GWAS hits throughout the genome across 1,035 FinnGen diseases, binned into 100-kb bins. (B) Enrichment of association signal in the HLA region by disease group. The 1,035 diseases were categorized into 45 disease groups based on ICD codes, and the average per-SNP enrichment in the HLA region was calculated by comparing the number of independent associations in the HLA region relative to that in the rest of the genome. (C) Classification of diseases with at least one significant SNP association in the HLA region by shared pathophysiology.
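The per-SNP enrichment in panel B can be read as a ratio of hit densities: independent associations per SNP inside the HLA region, divided by the same quantity for the rest of the genome. The caption does not give the exact formula, so the following is only a minimal sketch under that assumption. The function name and the toy counts are illustrative and are not taken from the study.

```python
def per_snp_enrichment(hits_in_region, snps_in_region, hits_elsewhere, snps_elsewhere):
    """Ratio of hits-per-SNP inside a region to hits-per-SNP outside it.

    Assumed definition (not confirmed by the source): each rate is the
    number of independent associations divided by the number of SNPs tested.
    """
    if snps_in_region <= 0 or snps_elsewhere <= 0:
        raise ValueError("SNP counts must be positive")
    if hits_elsewhere <= 0:
        raise ValueError("need at least one hit outside the region")
    region_rate = hits_in_region / snps_in_region
    background_rate = hits_elsewhere / snps_elsewhere
    return region_rate / background_rate


# Toy numbers, made up for illustration only.
toy = per_snp_enrichment(hits_in_region=10, snps_in_region=1_000,
                         hits_elsewhere=50, snps_elsewhere=100_000)
print(round(toy, 6))
```

An average enrichment for a disease group would then be the mean of this ratio over the diseases in that group, again assuming this definition.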
Figure 3
Pleiotropic structure of the HLA region. (A) Distribution of significant SNP associations across the HLA region, binned by nearest gene. Each bar represents a different gene, and the width corresponds to the length of the gene boundaries. (B) Heatmap of normalized Z scores for the 428 variants in the HLA region significantly associated with at least one disease. The x axis has SNPs projected on a pseudo-genome position scale, roughly proportional to the genome position of the variant, and the y axis corresponds to the HLA-associated diseases. Associations with all HLA-associated diseases are shown for all variants that had an independent significant association with at least one disease. The three blocks used in subsequent analysis are circled, and a schematic labeling key genes in and between each block is included. Simplified names for key diseases or groups of diseases are labeled in the schematic on the y axis with the length of the bar roughly proportional to the number of diseases the label is representing. A full description of all diseases can be found in Table S1. (C) Linkage disequilibrium as measured by r² and D of the approximately 40,000 SNPs covering the HLA region (MAF > 1%).
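For readers unfamiliar with the two linkage disequilibrium measures named in panel C, the sketch below computes D and r² for one pair of biallelic SNPs from phased haplotypes, using the standard textbook definitions. It is not the authors' code; the function name and the input encoding (0 for the reference allele, 1 for the alternate allele) are assumptions made for illustration.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    one per phased chromosome, with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # D: deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d, d * d / denom


# Perfectly linked toy example: the two SNPs always carry the same allele.
print(ld_stats([(1, 1), (1, 1), (0, 0), (0, 0)]))
```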
Figure 4
Haplotype group regression analysis pipeline. Overview of the pipeline for identifying the haplotype groups for each of the three blocks in the HLA region and performing disease associations. For each block, all unique phased combinations of nucleotides at 1,000 randomly selected SNPs were considered as haplotypes. We then clustered related haplotypes into groups by recursively splitting the dendrogram at each branchpoint (see subjects and methods). Finally, for each of the three blocks, we performed association analyses between the haplotype groups and the 269 HLA-associated diseases, including all haplotype groups for a given block except the most frequent in each regression, as well as sex, age, and the first ten principal components of the genome-wide genotype matrix as covariates.
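The regression setup described in this caption, in which every haplotype group except the most frequent enters the model, amounts to treating the most frequent group as the reference level. A minimal sketch of how the haplotype-group predictors could be built is shown below. It assumes each individual has already been assigned one group label per chromosome copy; the group labels and the function name are hypothetical. The clustering step, the covariates (sex, age, principal components), and the model fit itself are omitted because the caption does not specify them in enough detail.

```python
from collections import Counter


def haplotype_group_dosages(pairs):
    """Build per-individual copy counts (0, 1, or 2) for each haplotype group.

    `pairs` holds one (group_on_copy_1, group_on_copy_2) tuple per individual.
    The most frequent group is dropped and serves as the reference level.
    """
    counts = Counter(group for pair in pairs for group in pair)
    reference = counts.most_common(1)[0][0]
    columns = sorted(group for group in counts if group != reference)
    rows = [[pair.count(group) for group in columns] for pair in pairs]
    return reference, columns, rows


# Hypothetical group labels for four individuals.
pairs = [("A", "A"), ("A", "B"), ("B", "C"), ("A", "C")]
reference, columns, rows = haplotype_group_dosages(pairs)
print(reference, columns, rows)
```

Each row could then be concatenated with that individual's covariates to form one line of a design matrix for a per-disease regression.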
Figure 5
Haplotype group regression results. A dendrogram showing the clustering of the 40 most frequent haplotypes per haplotype group, with white representing the reference allele and black representing the effect allele. Genes are labeled below the corresponding SNPs overlapping their genome position, indicating which are within gene boundaries and which are intergenic. Heatmap showing the Z scores from the haplotype group regression analysis across associated diseases for (A) block 1, (B) block 2, and (C) block 3, including all diseases with at least one association |Z| > 4 in that block, and all haplotype groups with at least one disease association or total copies greater than the minimum cutoff of 20,000 copies. For visualization purposes, diseases are clustered, and Z scores were set to a maximum |Z| of 5. MS, multiple sclerosis; CTDs, connective tissue diseases; SLE, systemic lupus erythematosus; RA, rheumatoid arthritis; Rx, medical prescription; T1D, type 1 diabetes; CKD, chronic kidney disease; STD, sexually transmitted disease. See Table S1 for the full description of all diseases.
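The two display rules in this caption (keep diseases with at least one |Z| > 4, then cap values at |Z| = 5) can be sketched as a small filtering step. This is an illustration of those rules only, not the authors' plotting code; the function name, the dictionary layout, and the toy Z scores are assumptions.

```python
def prepare_heatmap(z_by_disease, threshold=4.0, cap=5.0):
    """Keep diseases with any |Z| above `threshold`, then clip Z to [-cap, cap]."""
    kept = {}
    for disease, z_scores in z_by_disease.items():
        if any(abs(z) > threshold for z in z_scores):
            kept[disease] = [max(-cap, min(cap, z)) for z in z_scores]
    return kept


# Toy Z scores across three haplotype groups (made up for illustration).
toy = {
    "disease_1": [0.5, -6.2, 4.5],   # kept; -6.2 is clipped to -5.0
    "disease_2": [1.0, -3.9, 2.0],   # dropped; no |Z| exceeds 4
}
print(prepare_heatmap(toy))
```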
Figure 6
Correlation of haplotype-associated diseases. (A) Overview and comparison of the pairwise relationships between diseases that were significantly associated in the haplotype group regression analysis, comparing genome-wide LDSC genetic correlations, Pearson’s correlation across haplotype groups in each block, and phenotypic correlations. (B) Comparison of correlation measures between Graves disease and rheumatoid arthritis. Error bars correspond to standard error.
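One of the measures compared here, Pearson's correlation across haplotype groups, can be illustrated by correlating two diseases' per-haplotype-group effect estimates within a block. The sketch below assumes the inputs are Z scores aligned by haplotype group, which the caption does not state explicitly; the function name and the toy vectors are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy Z scores for two diseases across the same four haplotype groups.
disease_a = [2.0, -1.0, 0.5, 3.0]
disease_b = [1.5, -0.5, 0.0, 2.5]
print(round(pearson(disease_a, disease_b), 3))
```

A positive value would indicate that haplotype groups raising risk for one disease tend to raise it for the other within that block, while a negative value would indicate opposing effects.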
