J Virol. 2011 Feb;85(3):1310-21.
doi: 10.1128/JVI.01966-10. Epub 2010 Nov 17.

Mapping the landscape of host-pathogen coevolution: HLA class I binding and its relationship with evolutionary conservation in human and viral proteins

Tomer Hertz et al. J Virol. 2011 Feb.

Abstract

The high diversity of HLA binding preferences has been driven by the sequence diversity of short segments of relevant pathogenic proteins presented by HLA molecules to the immune system. To identify possible commonalities in HLA binding preferences, we quantify these using a novel measure termed "targeting efficiency," which captures the correlation between HLA-peptide binding affinities and the conservation of the targeted proteomic regions. Analysis of targeting efficiencies for 95 HLA class I alleles over thousands of human proteins and 52 human viruses indicates that HLA molecules preferentially target conserved regions in these proteomes, although the arboviral Flaviviridae are a notable exception where nonconserved regions are preferentially targeted by most alleles. HLA-A alleles and several HLA-B alleles that have maintained close sequence identity with chimpanzee homologues target conserved human proteins and DNA viruses such as Herpesviridae and Adenoviridae most efficiently, while all HLA-B alleles studied efficiently target RNA viruses. These patterns of host and pathogen specialization are both consistent with coevolutionary selection and functionally relevant in specific cases; for example, preferential HLA targeting of conserved proteomic regions is associated with improved outcomes in HIV infection and with protection against dengue hemorrhagic fever. Efficiency analysis provides a novel perspective on the coevolutionary relationship between HLA class I molecular diversity, self-derived peptides that shape T-cell immunity through ontogeny, and the broad range of viruses that subsequently engage with the adaptive immune response.
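As a rough illustration of the "targeting efficiency" measure, the sketch below follows the definition given in the Fig. 1 legend: a rank correlation, along one protein, between per-site conservation scores and per-site HLA binding scores, where each site's binding score is an average over the peptides containing that site. It is a minimal sketch rather than the authors' implementation. The function names, the 9-residue peptide length, and the convention that larger values mean "more conserved" and "stronger binding" are all assumptions made for the example; the per-peptide binding scores would come from some binding predictor that is not specified here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def site_binding_scores(peptide_scores, peptide_len, protein_len):
    """Average, for each site, the scores of all peptides that contain it.

    peptide_scores[s] is the (hypothetical) predicted binding score of the
    peptide starting at position s, one entry per possible start position.
    """
    n_peptides = protein_len - peptide_len + 1
    if n_peptides < 1:
        raise ValueError("protein is shorter than the peptide length")
    if len(peptide_scores) != n_peptides:
        raise ValueError("expected one score per peptide start position")
    sites = []
    for pos in range(protein_len):
        first = max(0, pos - peptide_len + 1)  # earliest peptide covering pos
        last = min(pos, n_peptides - 1)        # latest peptide covering pos
        sites.append(mean(peptide_scores[first:last + 1]))
    return sites


def targeting_efficiency(conservation, peptide_scores, peptide_len=9):
    """Efficiency score r for one allele and one protein.

    Assumption: higher conservation values mean more conserved sites and
    higher peptide scores mean stronger binding, so r > 0 indicates that
    the allele preferentially targets conserved regions.
    """
    binding = site_binding_scores(peptide_scores, peptide_len, len(conservation))
    return spearman(conservation, binding)
```

With toy inputs such as `targeting_efficiency([1, 2, 3, 4, 5], [1, 2, 3, 4], peptide_len=2)`, conservation and site binding rise together along the protein, so the score is 1.0; reversing the peptide scores gives -1.0. If a real predictor reports binding energies where lower means stronger binding, the scores would need to be negated first to keep the same sign convention.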

Figures

FIG. 1.
Computing allele efficiency scores. (A) A representative illustration of an MHC molecule (yellow) binding to a segment of the HIV-1 Gag protein, showing comparisons with the phylogenetically related simian immunodeficiency virus (SIV) Gag and feline immunodeficiency virus (FIV) Gag proteins. While the topology of the phylogenetic tree is shared among the protein sites, the evolutionary rates, indicated by the variation of color of the branches from red (most conserved) to blue (least conserved), may vary dramatically. (B and C) The allele targeting efficiency score (r) for a given protein (herpesvirus-1 capsid triplex subunit 1 in this case) and a given allele (HLA-A*2402 here) is defined by the rank correlation coefficient between site conservation scores (evolutionary rate) and HLA site binding scores (an average binding energy for peptides containing the site), assessed at each amino acid position along the protein. (D) Distributions of HLA-A and HLA-B locus efficiency scores for a range of human viruses. Each point represents an HLA allele-specific efficiency for the relevant full-length viral proteome. Blue bars represent locus means. P values indicate significance of locus differences tested using mixed effects analysis (Fig. 5 and Table 1). Adeno, adenovirus; Flu, influenza virus; Hep B, hepatitis B virus; HSV 2, herpes simplex virus type 2.
FIG. 2.
Comparative histograms of human and random protein efficiency scores. These histograms demonstrate preferential targeting of evolutionarily conserved self-peptides by class I HLA molecules. The distribution of human efficiency scores is statistically significantly higher than that of random proteins (P = 5.4 × 10⁻⁷⁷).
FIG. 3.
Mean efficiency scores of 4,761 human (OMIM) proteins and 3,800 random (UniRef50) proteins. The P values are based on mixed effects analysis (see the supplemental material).
FIG. 4.
Allele efficiency scores for OMIM human proteins by HLA supertype groups. HLA efficiency scores of 95 HLA alleles grouped by supertypes are shown for the set of 4,761 human proteins that form the OMIM database. As can be seen, HLA-A alleles have higher efficiency scores than HLA-B alleles, with the exception of the B58 supertype.
FIG. 5.
Heat map distribution of allele efficiencies for human viruses and human proteins (x axis) by HLA supertype families (y axis). A matrix of efficiency scores computed for each of the 95 HLA alleles studied for 52 human viruses and a set of human proteins. Each entry in this efficiency matrix represents the efficiency score of a specific HLA allele (y axis) for a specific viral proteome. HLA alleles were grouped by supertypes, and human viruses were grouped by viral families and by Baltimore classification. Average efficiency scores over a large set of human proteins are presented in the bar to the left of the matrix. Distinct patterns of targeting efficiency can be observed both for HLA alleles (grouped by supertype or locus) and for different viral groups and families. UC, unclassified alleles that have not been assigned to supertypes; HSV-1, herpes simplex virus type 1; EBV, Epstein-Barr virus; CMV, cytomegalovirus; KSHV, Kaposi's sarcoma-associated herpesvirus; SARS-CoV, severe acute respiratory syndrome coronavirus; HTLV-1, human T-cell leukemia virus type 1; ssRNA, single-stranded RNA; RT, reverse transcriptase.
FIG. 6.
Comparison of efficiency scores of human and plant DNA viruses (A) and human and plant RNA viruses (B) utilizing mixed effects analysis. The P values are based on mixed effects analysis (see the supplemental material).
FIG. 7.
Correlations between allele efficiency scores for human proteins and human viruses. (A) Correlation (Spearman rank) between allele efficiency scores for human proteins, cytomegalovirus (herpesvirus), and dengue virus (flavivirus) (each representing one column in Fig. 5). (B) Frequency distributions of correlation coefficients between proteomes derived from randomized HLA alleles (n = 10,000) compared with actual values from panel A, as indicated with arrows. (C) Correlation matrix of efficiency scores: human and viral proteomes (the three scores from panel A are dots with appropriate intensity in this matrix). The extent to which HLA efficiency scores are correlated between human viruses, as well as self-peptides (extreme left column), is represented here according to Spearman rank correlation coefficient values. For abbreviations, see the legend of Fig. 5.
FIG. 8.
Targeting efficiencies for dengue virus (serotype 2, whole proteome), for all 95 analyzed HLA alleles. Each dot marks the efficiency score of a single HLA allele. Alleles are sorted by loci. Blue bars represent average locus efficiencies. HLA alleles previously associated with hemorrhagic fever are marked by squares, and those associated with protection are indicated with diamonds. Differences between the two groups were found to be significant (P = 0.05).
FIG. 9.
Targeting efficiencies for HIV-1 Gag protein for all 95 analyzed HLA alleles. Each dot marks the efficiency score of a single HLA allele. Alleles are sorted by loci. Blue bars represent average locus efficiencies. HLA alleles previously associated with slow HIV disease progression are marked by triangles, and those associated with rapid disease progression are indicated with squares. Alleles that have been associated with protection from infection are marked by diamonds.

