PLoS Comput Biol. 2017 Aug 28;13(8):e1005693.
doi: 10.1371/journal.pcbi.1005693. eCollection 2017 Aug.

HLA class I haplotype diversity is consistent with selection for frequent existing haplotypes

Idan Alter et al. PLoS Comput Biol.

Abstract

The major histocompatibility complex (MHC) contains the most polymorphic genetic system in humans, the human leukocyte antigen (HLA) genes of the adaptive immune system. High allelic diversity in HLA is argued to be maintained by balancing selection, such as negative frequency-dependent selection or heterozygote advantage. Selective pressure against immune escape by pathogens can maintain appreciable frequencies of many different HLA alleles. The selection pressures operating on combinations of HLA alleles across loci, or haplotypes, have not been extensively evaluated since the high HLA polymorphism necessitates very large sample sizes, which have not been available until recently. We aimed to evaluate the effect of selection operating at the HLA haplotype level by analyzing HLA A~C~B~DRB1~DQB1 haplotype frequencies derived from over six million individuals genotyped by the National Marrow Donor Program registry. In contrast with alleles, HLA haplotype diversity patterns suggest purifying selection, as certain HLA allele combinations co-occur in high linkage disequilibrium. Linkage disequilibrium is positive (Dij'>0) among frequent haplotypes and negative (Dij'<0) among rare haplotypes. Fitting the haplotype frequency distribution to several population dynamics models, we found that the best fit was obtained when significant positive frequency-dependent selection (FDS) was incorporated. Finally, the Ewens-Watterson test of homozygosity showed excess homozygosity for 5-locus haplotypes within 23 US populations studied, with an average Fnd of 28.43. Haplotype diversity is most consistent with purifying selection for HLA Class I haplotypes (HLA-A, -B, -C); purifying selection was not inferred for HLA Class II haplotypes (-DRB1 and -DQB1). We discuss our empirical results in the context of evolutionary theory, exploring potential mechanisms of selection that maintain high linkage disequilibrium in MHC haplotype blocks.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Methodology validation.
(A) Fraction of discordance between the computed most probable haplotype for an individual based on low resolution typing and the haplotype measured from high resolution typing. The discordance level is computed as a function of the low resolution typing haplotype frequency. (B) Quantile-Quantile (QQ) comparison of the frequency distribution of the most probable haplotype per individual as computed from low resolution typing and the parallel from high resolution typing. The distribution was computed over all donors with both high and low resolution typing. The x-axis is the frequency of a haplotype at the kth quantile in the low resolution based EM estimate, and the y-axis is the parallel in the high resolution typing. Values on the diagonal imply a similar cumulative distribution function (CDF). (C) The observed haplotypes were logarithmically binned by their low resolution frequency, and the geometric means of both the low and high resolution frequencies were calculated, resulting in largely the same values for each bin. (D) Quantile-Quantile comparison of the haplotype distribution in the same patients between phased and un-phased genotypes.
Fig 2. Fnd values of haplotypes and allele frequencies.
The difference between expected and observed homozygosity, as defined by the Fnd of alleles (A plot), allele combination (B plot) and haplotype (C plot) frequency distributions, calculated from subsamples of 1,200 individuals in 4 broad race groups (AFA: African American, API: Asian and Pacific Islander, CAU: European (Caucasian), and HIS: Hispanic). Single-locus Fnd values are negative in most populations, indicating that observed homozygosity is lower than that expected for a constant-size neutrally evolving population (A plot). Contrary to the single-locus results, Fnd values of full 5-locus haplotypes are positive, denoting multi-locus homozygosity above expectation (C plot). This extra homozygosity holds more strongly for Class I loci than for Class II loci. Two-locus Class I haplotypes containing the HLA-A locus show positive Fnd values (B plot).
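As a rough illustration of the quantity plotted in Fig 2, the sketch below computes observed homozygosity from a sample and a normalized deviate. It assumes the commonly used definition Fnd = (F_obs - F_exp) / sqrt(Var(F_exp)); the expected homozygosity and its variance under neutrality are taken as inputs here (in practice they come from the Ewens-Watterson null distribution for the given sample size and number of distinct types). The function names and example values are hypothetical.

```python
from collections import Counter
import math


def observed_homozygosity(types):
    """Sum of squared sample frequencies of the observed alleles or haplotypes."""
    counts = Counter(types)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())


def normalized_deviate(f_obs, f_exp, var_exp):
    """Fnd under the assumed definition: (F_obs - F_exp) / sqrt(Var(F_exp)).

    f_exp and var_exp are the neutral expectation and variance, supplied
    by the caller; they are not derived in this sketch.
    """
    return (f_obs - f_exp) / math.sqrt(var_exp)


# Hypothetical toy sample of haplotype labels, not data from the study.
sample = ["h1", "h1", "h2", "h2"]
f_obs = observed_homozygosity(sample)
fnd = normalized_deviate(f_obs, f_exp=0.3, var_exp=0.01)
```

With this sign convention, a positive value means more homozygosity than the neutral expectation, and a negative value means less.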
Fig 3. Linkage disequilibrium patterns.
Mean normalized Dij' (using the Lewontin normalization) as a function of the frequency for four broad race populations. Dij' is a normalized measure of linkage disequilibrium (LD) taking values between -1 and 1 (maximal LD); under neutrality it should be 0. Each allele/haplotype pair was assigned its combined frequency and a Dij' value, and the Dij' values were averaged over all allele pairs within the same frequency bin in a given population. For all allele pairs, the average Dij' values are zero for very low frequencies, null or negative for intermediate frequencies, and highly positive for high frequencies. This pattern of Dij' values indicates that frequent allele pairs are much more frequent than expected from the frequencies of their components (i.e., P(AB) > P(A)P(B) for high P(AB) values, but P(AB) < P(A)P(B) for intermediate P(AB) values).
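The Lewontin normalization referred to in the caption can be sketched as follows for a single allele pair. This is a minimal illustration of the standard signed D' definition, not the authors' code; the function name and the example frequencies are hypothetical.

```python
def lewontin_d_prime(p_ab, p_a, p_b):
    """Signed Lewontin-normalized LD for one allele pair.

    p_ab is the frequency of the pair (haplotype), p_a and p_b the
    frequencies of its two component alleles.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium D
    if d >= 0:
        # Largest D attainable given the allele frequencies
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        # Largest magnitude of negative D attainable
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a fixed or absent allele carries no LD signal
    return d / d_max


# Hypothetical frequencies: the pair occurs more often than P(A)P(B) = 0.04.
d_prime = lewontin_d_prime(p_ab=0.10, p_a=0.2, p_b=0.2)
```

Averaging such values over all pairs that fall in the same frequency bin gives a curve of the kind described in the caption.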
Fig 4. Comparison of frequency models.
(A) The difference between Bayesian Information Criterion (BIC) values of the maximum likelihood estimate (MLE) fit of the Yule model and the BDIM. The models were fitted to frequency distributions of each allele separately, as well as to the full 5-locus haplotype frequency distribution. The calculation was repeated for all 18 detailed race groups and 5 broad race groups (black dots). The red line represents the median over the 23 populations, while the blue lines represent the other quantiles. (B) The selection parameter δ of the MLE fit to the BDIM. This parameter represents the net reproductive disadvantage of the rare frequencies. Thus, positive values suggest a disadvantage to rare alleles/haplotypes.
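The model comparison in panel (A) rests on the Bayesian Information Criterion, BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of observations, and L the maximized likelihood. The sketch below shows the arithmetic only; the log-likelihoods and parameter counts are made-up placeholders, and the sign convention of the difference (first model minus second) is an assumption rather than something the caption specifies.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def delta_bic(ll_first, k_first, ll_second, k_second, n_obs):
    """BIC of the first model minus BIC of the second.

    A positive result favors the second model under this convention.
    """
    return bic(ll_first, k_first, n_obs) - bic(ll_second, k_second, n_obs)


# Placeholder values, not results from the study.
diff = delta_bic(ll_first=-100.0, k_first=1,
                 ll_second=-90.0, k_second=2, n_obs=100)
```

The extra parameter of the second model is penalized by ln(n), so it is preferred only when its gain in log-likelihood outweighs that penalty.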

