PLoS One. 2007 Aug 29;2(8):e796.

doi: 10.1371/journal.pone.0000796.

NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence

Morten Nielsen¹, Claus Lundegaard, Thomas Blicher, Kasper Lamberth, Mikkel Harndahl, Sune Justesen, Gustav Røder, Bjoern Peters, Alessandro Sette, Ole Lund, Søren Buus

PMID: 17726526
PMCID: PMC1949492
DOI: 10.1371/journal.pone.0000796

Morten Nielsen et al. PLoS One. 2007.

Affiliation

¹ Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, Lyngby, Denmark. mniel@cbs.dtu.dk

Abstract

Background: Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking.

Principal findings: Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenous presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis.

Conclusions: Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Prospective validation using hitherto uncharacterized HLA molecules.**
The upper panel gives the IC50 binding values for the sets of peptides identified by the *NetMHCpan* method as binders to two hitherto uncharacterized HLA molecules, HLA-A*8001 and HLA-A*7401. The peptides were selected as described in the text. 86% of the tested peptides bind with an affinity stronger than 500 nM. The lower panel shows a Kullback-Leibler logo visualization of the HLA binding motifs as predicted by the *NetMHCpan* method. Peptide binders used to generate the logos for each HLA molecule were selected from a pool of 500,000 random natural nonamers using the *NetMHCpan* method with a binding threshold of 500 nM. The logos were generated with the logo program of Schneider and Stephens. Note that the binding motifs visualized in the logo plot are estimated from a set of approximately 5000 predicted binders, whereas the validated peptides make up only the top 0.2%.
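
As a rough illustration of what a Kullback-Leibler logo encodes, the sketch below computes the per-position information content, in bits, of a set of equal-length peptides. It assumes a uniform background of 1/20 per amino acid and applies no pseudocounts or sequence weighting, so it is not a reproduction of the logo program used for the figure. The function name `kl_information` and the toy peptides are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kl_information(peptides, background=None):
    """Per-position Kullback-Leibler information (bits) for equal-length peptides."""
    if background is None:
        # Assumption: uniform background; a real logo would use natural frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    info = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts.values())
        # Sum of p * log2(p / q) over observed residues (unobserved ones contribute 0).
        info.append(sum((n / total) * math.log2((n / total) / background[aa])
                        for aa, n in counts.items()))
    return info

# Toy nonamers, made up for illustration (not taken from the paper).
toy_binders = ["ALDKVTSAL", "ALEKATSAV", "ALDRVQSAL", "ALGKVTPAV"]
profile = kl_information(toy_binders)
```

Fully conserved positions, such as position 2 in the toy set, reach the maximum of log2(20) bits under the uniform background, while variable positions score lower.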

**Figure 2. Predictive performance of the *NetMHCpan* method as a function of the distance to its nearest neighbor HLA allele.**
The nearest neighbor distance is estimated from the alignment score of the HLA pseudo sequences using the relation $d = 1 - \frac{s(A,B)}{\sqrt{s(A,A)\,s(B,B)}}$, where s(A,B) is the BLOSUM50 alignment score between the pseudo sequences for alleles A and B, respectively. HLA-A alleles are shown as solid circles. HLA-B alleles are shown as +. The Pearson correlation coefficient between the pseudo sequence distance and the predictive performance for the 42 HLA alleles included in the plot is 0.67. Note that the distance measure inherently assumes that all residues are equally important and independent of the pseudo sequence context. While this assumption is obviously inconsistent with the reality of primary anchors, it meets another essential requirement: it is simple and unbiased.
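
A minimal sketch of such a pseudo sequence distance follows, assuming the normalized form d = 1 - s(A,B) / sqrt(s(A,A) * s(B,B)) and an ungapped, position-by-position alignment of equal-length pseudo sequences. The scoring function `toy_score` is a placeholder and not BLOSUM50, and the two pseudo sequences are made up; a real BLOSUM50 lookup would be passed in its place.

```python
import math

def alignment_score(seq_a, seq_b, score):
    """Ungapped, position-by-position score of two equal-length pseudo sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("pseudo sequences must have equal length")
    return sum(score(a, b) for a, b in zip(seq_a, seq_b))

def pseudo_distance(seq_a, seq_b, score):
    """Assumed form: d = 1 - s(A,B) / sqrt(s(A,A) * s(B,B))."""
    s_ab = alignment_score(seq_a, seq_b, score)
    s_aa = alignment_score(seq_a, seq_a, score)
    s_bb = alignment_score(seq_b, seq_b, score)
    return 1.0 - s_ab / math.sqrt(s_aa * s_bb)

def toy_score(a, b):
    # Placeholder scores, NOT BLOSUM50: +5 for identity, -1 for a mismatch.
    return 5 if a == b else -1

# Made-up pseudo sequences for illustration only.
allele_a = "YFAMYGEKVAHT"
allele_b = "YFAMYQENVAHT"
d_same = pseudo_distance(allele_a, allele_a, toy_score)
d_diff = pseudo_distance(allele_a, allele_b, toy_score)
```

Under this normalization an allele has distance 0 to itself, and the distance grows as the pseudo sequences diverge, which matches the equal-weight, context-independent assumption noted in the legend.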

**Figure 3. HLA clustering from *NetMHCpan* predictions.**
The left-hand panel shows the clustering for 36 representative HLA-A alleles, and the right-hand panel the clustering for 51 representative HLA-B alleles. The trees are generated using the neighbor-joining algorithm from HLA distance matrices as described in the text. The 12 common supertypes are highlighted with solid-line circles. The proposed novel (sub)-supertypes are highlighted with dotted circles.

**Figure 4. Definition of the HLA pseudo sequence.**
The upper part of the figure shows the residues of the HLA sequence estimated to be in contact with the peptide in the binding cleft. The columns give the HLA residue numbering according to the IMGT nomenclature. The rows show the interactions with the nine peptide positions. Squares in grey mark the peptide positions estimated to be in contact with the corresponding HLA residue. The lower part of the figure shows the amino acid polymorphism at each position in the pseudo sequence, both those that are common to HLA-A and -B, and those that are unique to the HLA-A and HLA-B loci, respectively (as of February 2007).
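
The idea of a pseudo sequence can be sketched as picking out the residues at a fixed list of contact positions and concatenating them. The example below assumes 1-based residue numbering; the input sequence and the position list are invented for illustration and are not the contact residues defined in the figure.

```python
def pseudo_sequence(hla_sequence, contact_positions):
    """Concatenate the residues found at the given 1-based positions."""
    residues = []
    for pos in contact_positions:
        if not 1 <= pos <= len(hla_sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        residues.append(hla_sequence[pos - 1])
    return "".join(residues)

# Hypothetical inputs: a toy sequence and toy positions, chosen only
# to show the indexing; they carry no biological meaning.
toy_hla = "ACDEFGHIKLMNPQRSTVWY"
toy_contacts = [2, 5, 9, 20]
toy_pseudo = pseudo_sequence(toy_hla, toy_contacts)
```

Because every allele is reduced to the same set of positions, the resulting pseudo sequences all have the same length and can be compared position by position.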
