PLoS Comput Biol. 2017 Aug 28;13(8):e1005693.

doi: 10.1371/journal.pcbi.1005693. eCollection 2017 Aug.

HLA class I haplotype diversity is consistent with selection for frequent existing haplotypes

Idan Alter¹, Loren Gragert²,³, Stephanie Fingerson², Martin Maiers², Yoram Louzoun¹

Affiliations

¹ Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel.
² National Marrow Donor Program, Minneapolis, Minnesota, United States of America.
³ Department of Pathology and Laboratory Medicine, Tulane University School of Medicine, New Orleans, Louisiana, United States of America.

PMID: 28846675
PMCID: PMC5590998
DOI: 10.1371/journal.pcbi.1005693

Idan Alter et al. PLoS Comput Biol. 2017.

Abstract

The major histocompatibility complex (MHC) contains the most polymorphic genetic system in humans, the human leukocyte antigen (HLA) genes of the adaptive immune system. High allelic diversity in HLA is argued to be maintained by balancing selection, such as negative frequency-dependent selection or heterozygote advantage. Selective pressure against immune escape by pathogens can maintain appreciable frequencies of many different HLA alleles. The selection pressures operating on combinations of HLA alleles across loci, or haplotypes, have not been extensively evaluated, since the high HLA polymorphism necessitates very large sample sizes, which have not been available until recently. We aimed to evaluate the effect of selection operating at the HLA haplotype level by analyzing HLA A~C~B~DRB1~DQB1 haplotype frequencies derived from over six million individuals genotyped by the National Marrow Donor Program registry. In contrast with alleles, HLA haplotype diversity patterns suggest purifying selection, as certain HLA allele combinations co-occur in high linkage disequilibrium. Linkage disequilibrium is positive (D_ij' > 0) among frequent haplotypes and negative (D_ij' < 0) among rare haplotypes. Fitting the haplotype frequency distribution to several population dynamics models, we found that the best fit was obtained when significant positive frequency-dependent selection (FDS) was incorporated. Finally, the Ewens-Watterson test of homozygosity showed excess homozygosity for 5-locus haplotypes within the 23 US populations studied, with an average normalized deviate of homozygosity (Fnd) of 28.43. Haplotype diversity is most consistent with purifying selection for HLA Class I haplotypes (HLA-A, -B, -C); such selection was not inferred for HLA Class II haplotypes (-DRB1 and -DQB1). We discuss our empirical results in the context of evolutionary theory, exploring potential mechanisms of selection that maintain high linkage disequilibrium in MHC haplotype blocks.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Methodology validation.**
(A) Fraction of discordance between the computed most probable haplotype for an individual based on low resolution typing and the haplotype measured from high resolution typing. The discordance level is computed as a function of the low resolution typing haplotype frequency. (B) Quantile-Quantile (QQ) comparison of the frequency distribution of the most probable haplotype per individual as computed from low resolution typing and its counterpart from high resolution typing. The distribution was computed over all donors with both high and low resolution typing. The x-axis is the frequency of a haplotype at the kth quantile in the low resolution based EM estimate, and the y-axis is its counterpart in the high resolution typing. Values on the diagonal imply a similar cumulative distribution function (CDF). (C) The observed haplotypes were logarithmically binned by their low resolution frequency, and the geometric means of both the low and high resolution frequencies were calculated, resulting in largely the same values for each bin. (D) Quantile-Quantile comparison of the haplotype distribution in the same patients between phased and un-phased genotypes.

**Fig 2. Fnd values of haplotypes and allele frequencies.**
The difference between expected and observed homozygosity, as defined by the Fnd of allele (A plot), allele combination (B plot) and haplotype (C plot) frequency distributions, calculated from subsamples of 1,200 individuals in 4 broad race groups (AFA: African American, API: Asian and Pacific Islander, CAU: European (Caucasian) and HIS: Hispanic). Single-locus Fnd values are negative in most populations, indicating that observed homozygosity is below that expected for a constant-size neutrally evolving population (A plot). Contrary to the single-locus results, Fnd values of full 5-locus haplotypes are positive, denoting multi-locus homozygosity above expectation (C plot). This excess homozygosity holds more strongly for Class I loci than for Class II loci. The Fnd values of two-locus Class I haplotypes containing the HLA-A locus are positive (B plot).
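The Fnd statistic in this caption can be illustrated with a minimal Python sketch. It assumes the usual definition of the normalized deviate, (observed homozygosity minus null mean) divided by the null standard deviation. The null mean and standard deviation under the Ewens-Watterson neutral model are taken here as caller-supplied inputs, since the record does not say how they were obtained; the function names and the toy counts are hypothetical.

```python
def homozygosity(counts):
    """Observed homozygosity F = sum of squared frequencies.

    `counts` holds the number of copies of each distinct allele
    (or haplotype) in the sample.
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("sample must contain at least one copy")
    return sum((c / n) ** 2 for c in counts)


def normalized_deviate(counts, null_mean, null_sd):
    """Fnd = (F_observed - F_expected) / SD(F_expected).

    null_mean and null_sd are ASSUMED to come from the neutral
    (Ewens-Watterson) expectation for the same sample size and the
    same number of distinct alleles; they are not computed here.
    """
    if null_sd <= 0:
        raise ValueError("null_sd must be positive")
    return (homozygosity(counts) - null_mean) / null_sd


if __name__ == "__main__":
    toy_counts = [5, 5]  # two alleles, five copies each (made-up data)
    print(homozygosity(toy_counts))
    print(normalized_deviate(toy_counts, null_mean=0.4, null_sd=0.05))
```

With this sign convention, a positive value means more homozygosity than the neutral expectation and a negative value means less, which matches how the caption reads the single-locus and 5-locus results.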

**Fig 3. Linkage disequilibrium patterns.**
Mean normalized D_ij' (using the Lewontin normalization) as a function of frequency for four broad race populations. D_ij' is a normalized measure of linkage disequilibrium (LD) taking values between -1 and 1 (maximal LD); under neutrality it should be 0. Each allele/haplotype pair was assigned its combined frequency and a D_ij' value, and the D_ij' values were averaged over all allele pairs within the same frequency bin in a given population. For all allele pairs, the average D_ij' values are zero for very low frequencies, zero or negative for intermediate frequencies and highly positive for high frequencies. This pattern of D_ij' values indicates that frequent allele pairs are much more frequent than expected from the frequencies of their components (e.g. P(AB) > P(A)P(B) for high P(AB) values, but P(AB) < P(A)P(B) for intermediate P(AB) values).
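As an illustration of the Lewontin normalization named in this caption, the following Python sketch computes D' for a single allele pair from the pair frequency P(AB) and the marginal frequencies P(A) and P(B). It uses the standard textbook definition; the function name and the example frequencies are hypothetical and are not taken from the paper.

```python
def lewontin_d_prime(p_ab, p_a, p_b):
    """Lewontin-normalized linkage disequilibrium for one allele pair.

    D = P(AB) - P(A)P(B) is divided by the largest magnitude it could
    take given the marginal frequencies, so the result lies in [-1, 1].
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        # Upper bound on D when the pair is over-represented.
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        # Bound on |D| when the pair is under-represented.
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a fixed or absent allele carries no LD signal
    return d / d_max


if __name__ == "__main__":
    # Pair seen more often than its components predict: positive D'.
    print(lewontin_d_prime(p_ab=0.15, p_a=0.2, p_b=0.3))
    # Pair seen less often than its components predict: negative D'.
    print(lewontin_d_prime(p_ab=0.02, p_a=0.2, p_b=0.3))
```

Averaging such values over all pairs in a frequency bin would give the kind of curve the caption describes, though the binning scheme itself is not specified in this record.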

**Fig 4. Comparison of frequency models.**
**(A)** The difference between the Bayesian Information Criterion (BIC) values of the maximum likelihood estimate (MLE) fits of the Yule model and the birth-death-innovation model (BDIM). The models were fitted to the frequency distribution of each allele separately, as well as to the full 5-locus haplotype frequency distribution. The calculation was repeated for all 18 detailed race groups and 5 broad race groups (black dots). The red line represents the median over the 23 populations, while the blue lines represent the other quantiles. **(B)** The selection parameter δ of the MLE fit to the BDIM. This parameter represents the net reproductive disadvantage of rare alleles/haplotypes. Thus, positive values suggest a disadvantage to rare alleles/haplotypes.
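The BIC comparison in panel (A) can be sketched in Python using the standard definition BIC = k ln(n) - 2 ln(L). The log-likelihoods, parameter counts and sample size below are made-up placeholders, and the sign convention of the difference (first model minus second) is an assumption, since the caption does not state which way the subtraction runs.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def delta_bic(ll_first, k_first, ll_second, k_second, n_obs):
    """BIC of the first model minus BIC of the second model.

    ASSUMED convention: a positive result favours the second model,
    because lower BIC indicates the preferred fit.
    """
    return bic(ll_first, k_first, n_obs) - bic(ll_second, k_second, n_obs)


if __name__ == "__main__":
    # Hypothetical fits: a 1-parameter model versus a 2-parameter model.
    diff = delta_bic(ll_first=-100.0, k_first=1,
                     ll_second=-95.0, k_second=2, n_obs=100)
    print(diff)  # positive here: the extra parameter pays for itself
```

The penalty term k ln(n) is what keeps a model with more parameters from winning on likelihood alone, which is why a BIC difference, rather than a raw likelihood ratio, is a reasonable way to compare models of different complexity.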

Cited by

Reply to Hedrick and Klitz: High haplotype discovery rate in the HLA locus.
Louzoun Y, Lobkovsky AE, Levi L, Wolf YI, Maiers M, Gragert L, Alter I, Koonin EV. Proc Natl Acad Sci U S A. 2019 Nov 19;116(47):23388-23389. doi: 10.1073/pnas.1916124116. Epub 2019 Oct 29. PMID: 31662470.
High Resolution Class I HLA -A, -B, and -C Diversity in Eastern and Southern African Populations.
Banjoko AW, Ng'uni T, Naidoo N, Ramsuran V, Hyrien O, Ndhlovu ZM. bioRxiv [Preprint]. 2024 Sep 8:2024.09.04.611164. doi: 10.1101/2024.09.04.611164. Update in: Sci Rep. 2025 Jul 2;15(1):23667. doi: 10.1038/s41598-025-06704-4. PMID: 39282263.
Polygenic polymorphism is associated with NKG2A repertoire and influences lymphocyte phenotype and function.
Le Luduec JB, Kontopoulos T, Panjwani MK, Sottile R, Liu H, Schäfer G, Massalski C, Lange V, Hsu KC. Blood Adv. 2024 Oct 22;8(20):5382-5399. doi: 10.1182/bloodadvances.2024013508. PMID: 39158076.
Clinical settings in which human leukocyte antigen typing is still useful in the diagnosis of celiac disease.
Schirru E, Rossino R, Jores RD, Corpino M, Muntoni S, Cucca F, Congia M. World J Gastroenterol. 2025 Apr 14;31(14):104397. doi: 10.3748/wjg.v31.i14.104397. PMID: 40248378. Review.
MHC Haplotyping of SARS-CoV-2 Patients: HLA Subtypes Are Not Associated with the Presence and Severity of COVID-19 in the Israeli Population.
Ben Shachar S, Barda N, Manor S, Israeli S, Dagan N, Carmi S, Balicer R, Zisser B, Louzoun Y. J Clin Immunol. 2021 Aug;41(6):1154-1161. doi: 10.1007/s10875-021-01071-x. Epub 2021 May 29. PMID: 34050837.
