Bioinformatics. 2018 May 1;34(9):1522-1528.
doi: 10.1093/bioinformatics/btx820.

An automated benchmarking platform for MHC class II binding prediction methods

Massimo Andreatta et al.

Abstract

Motivation: Computational methods for the prediction of peptide-MHC binding have become an integral and essential component of candidate selection in experimental T cell epitope discovery studies. The sheer number of published prediction methods, together with often discordant reports on their performance, poses a considerable quandary for the experimentalist who needs to choose the best tool for their research.

Results: With the goal of providing an unbiased, transparent evaluation of the state of the art in the field, we created an automated platform to benchmark peptide-MHC class II binding prediction tools. The platform evaluates the absolute and relative predictive performance of all participating tools on data newly entered into the Immune Epitope Database (IEDB) before they are made public, thereby providing a frequent, unbiased assessment of available prediction tools. The benchmark runs on a weekly basis, is fully automated, and displays up-to-date results on a publicly accessible website. The initial benchmark described here included six commonly used prediction servers, and other tools are encouraged to join through a simple sign-up procedure. Performance evaluation on 59 data sets comprising over 10 000 binding affinity measurements suggested that NetMHCIIpan is currently the most accurate tool, followed by NN-align and the IEDB consensus method.

Availability and implementation: Weekly reports on the participating methods can be found online at: http://tools.iedb.org/auto_bench/mhcii/weekly/.

Contact: mniel@bioinformatics.dtu.dk.

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1.
Workflow of the automatic benchmarking platform. The program checks on a weekly basis whether new references were added in the Immune Epitope Database (IEDB); when a new reference is detected, it is split into homogeneous data sets consisting of unique combinations of MHC allele and measurement type; sufficiently large data sets (at least 10 data points, of which at least two are positive and at least two are negative) are sent to the participating servers, independently of where they are hosted, through a standardized RESTful protocol; the predictions are retrieved from the servers; performance values in terms of SRCC and AUC are calculated for each participant; the servers are ranked from best to worst according to their performance values; the results of the evaluation, including aggregated scores over historical evaluations, are displayed on a web page publicly accessible online.
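The filtering and scoring steps in this workflow can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the platform's actual code: the 500 nM binder cutoff, the negation of predicted affinities for AUC, and all function names are assumptions made for the sketch.

```python
# Hypothetical sketch of the per-data-set filtering and scoring described in Fig. 1.
# Assumptions: measurements are IC50 values (nM), predictions are affinity scores
# where lower means stronger binding, and 500 nM is used as the binder cutoff.
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

BINDER_CUTOFF_NM = 500.0   # assumed positive/negative threshold (illustrative)
MIN_POINTS = 10            # at least 10 data points per data set
MIN_PER_CLASS = 2          # at least 2 positives and 2 negatives

def passes_inclusion_criteria(measured_ic50):
    """Apply the size and class-balance filters from the caption."""
    labels = [ic50 < BINDER_CUTOFF_NM for ic50 in measured_ic50]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return len(labels) >= MIN_POINTS and n_pos >= MIN_PER_CLASS and n_neg >= MIN_PER_CLASS

def evaluate(measured_ic50, predicted_ic50):
    """Return (SRCC, AUC) for one homogeneous data set and one participating server."""
    labels = [int(ic50 < BINDER_CUTOFF_NM) for ic50 in measured_ic50]
    # SRCC between measured and predicted affinities (lower = stronger for both).
    srcc, _ = spearmanr(measured_ic50, predicted_ic50)
    # For AUC, negate predictions so that higher scores mean "more likely binder".
    auc = roc_auc_score(labels, [-p for p in predicted_ic50])
    return srcc, auc
```

A data set that passes `passes_inclusion_criteria` would then be scored once per participating server, and the per-server results ranked as described in the caption.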
Fig. 2.
Amount of public data available for the benchmark. (A) Number of references and data sets that pass the criteria for inclusion in the benchmark by year of submission. (B) Cumulative number of MHC II binding data points in suitable data sets by year of submission
Fig. 3.
Predictive performance of the methods participating in the 2014–2016 benchmark in terms of SRCC (A) and AUC (B). Each dot represents one data set, and the width of the silhouettes is proportional to the density of points at different values. Solid horizontal bars show the mean performance of each method (Color version of this figure is available at Bioinformatics online.)
Fig. 4.
Relative ranks of the methods participating in the 2014–2016 benchmark. For each data set, all methods are ranked based on SRCC (A) and AUC (B). The best performing server in terms of SRCC receives a rank of one, the worst performing server a rank of zero, and all remaining servers are assigned scores evenly spaced between zero and one. Ranks are binned into five intervals of equal size for the bar plots. Servers are sorted from left to right based on the size of their top quintile. (C) Pairwise performance comparison of the methods on the subset of data sets shared by each pair. For each element M[x][y] in the heatmap, the cell is colored by the fraction of data sets for which method x outperforms method y (above the diagonal in terms of SRCC, below the diagonal in terms of AUC); ties are counted as 0.5. Values on the diagonal represent the total number of data sets in the 2014–2016 benchmark that can be evaluated by each method.
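The rank normalization used in panels A and B can be made concrete with a small sketch. The handling of exact ties outside the pairwise heatmap, and the function names below, are assumptions for illustration rather than the authors' implementation.

```python
# Hypothetical sketch of the relative-rank scheme described in Fig. 4A/B.
# `scores` maps method name -> SRCC (or AUC) on one data set.

def relative_ranks(scores):
    """Best method gets 1.0, worst gets 0.0, the rest are evenly spaced."""
    methods = sorted(scores, key=scores.get, reverse=True)
    n = len(methods)
    if n == 1:
        return {methods[0]: 1.0}
    return {m: 1.0 - i / (n - 1) for i, m in enumerate(methods)}

def quintile_bin(rank):
    """Bin a rank in [0, 1] into five equal-width intervals (0 = bottom, 4 = top)."""
    return min(int(rank * 5), 4)

# Example: four methods evaluated on one data set
ranks = relative_ranks({"A": 0.71, "B": 0.65, "C": 0.40, "D": 0.22})
# -> {"A": 1.0, "B": 0.667, "C": 0.333, "D": 0.0}
```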
Fig. 5.
Edit distance between peptides evaluated by the benchmark and the data used to train NetMHCIIpan. The edit distance is the minimal number of substitutions or terminal extensions required to mutate a peptide in the evaluation set into the most similar peptide in the training set, restricted to the same MHC molecule. The bar labeled with N identifies benchmarked peptides restricted to MHC molecules not present in the training data
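One plausible implementation of this distance slides the shorter peptide along the longer one without internal gaps, counting mismatches in the overlap as substitutions and the length difference as terminal extensions. The exact alignment rule used by the authors is not spelled out in the caption, so the sketch below is an assumption.

```python
# Hypothetical implementation of the distance defined in the Fig. 5 caption:
# substitutions within an ungapped overlap plus terminal extensions to equalize length.

def peptide_distance(pep_a, pep_b):
    """Minimal substitutions + terminal extensions to turn one peptide into the other (assumed rule)."""
    short, long_ = sorted((pep_a, pep_b), key=len)
    extensions = len(long_) - len(short)
    best = None
    # Slide the shorter peptide along the longer one (no internal gaps allowed).
    for offset in range(extensions + 1):
        mismatches = sum(a != b for a, b in zip(short, long_[offset:offset + len(short)]))
        d = mismatches + extensions
        best = d if best is None else min(best, d)
    return best

def distance_to_training_set(eval_peptide, training_peptides_same_mhc):
    """Distance to the most similar training peptide restricted to the same MHC molecule."""
    if not training_peptides_same_mhc:
        return None  # corresponds to the 'N' bar: allele absent from the training data
    return min(peptide_distance(eval_peptide, t) for t in training_peptides_same_mhc)
```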
Fig. 6.
The publicly accessible web page of the automated MHC class II prediction benchmark. Clicking on individual weekly entries shows detailed information on the data sets evaluated in that time period.
