Brief Bioinform. 2024 Nov 22;26(1):bbae625.
doi: 10.1093/bib/bbae625.

Meta learning for mutant HLA class I epitope immunogenicity prediction to accelerate cancer clinical immunotherapy

Long Xu et al. Brief Bioinform.

Abstract

Accurate prediction of binding between human leukocyte antigen (HLA) class I molecules and antigenic peptide segments is a challenging task and a key bottleneck in personalized immunotherapy for cancer. Although existing prediction tools have demonstrated significant results using established datasets, most can only predict the binding affinity of antigenic peptides to HLA and do not enable the immunogenic interpretation of new antigenic epitopes. This limitation arises because the training data for the computational models rely heavily on a large amount of peptide-HLA (pHLA) eluted ligand data, in which most of the candidate epitopes lack immunogenicity. Here, we propose an adaptive immunogenicity prediction model, named MHLAPre, which is trained on a large-scale mass spectrometry (MS)-derived HLA I eluted ligandome consisting mostly of presented epitopes that are immunogenic. Allele-specific and pan-allelic prediction models are also provided for endogenous peptide presentation. Using a meta-learning strategy, MHLAPre rapidly assessed HLA class I peptide affinities across all pHLA pairs and accurately identified tumor-associated endogenous antigens. In the adaptive T-cell immune response, pHLA-specific binding during antigen presentation is only a prerequisite for CD8+ T-cell recognition. The key factor in activating the immune response is the interaction between pHLA complexes and T-cell receptors (TCRs). Therefore, we performed transfer learning on the pHLA model using the pHLA-TCR dataset. In the pHLA binding task, MHLAPre demonstrated significant improvement in identifying neoepitope immunogenicity compared with five state-of-the-art models, demonstrating its effectiveness and robustness. After transfer learning on the pHLA-TCR data, MHLAPre also exhibited relatively superior performance in revealing the mechanism of immunotherapy. MHLAPre is a powerful tool to identify neoepitopes that can interact with TCRs and induce immune responses. We believe that the proposed method will greatly contribute to clinical immunotherapy, such as anti-tumor immunity, tumor-specific T-cell engineering, and personalized tumor vaccines.

Keywords: HLA genotyping; deep learning; epitope specificity; immunoinformatics; transfer learning.

Figures

Figure 1
Statistical information on immunogenicity data. (a) Sequence determination by mass spectrometry after elution of antigenic peptides; allele type determination by gene expression. (b) Amino acid site frequency plots of antigenic peptides bound by three HLA alleles, classified according to peptide length (HLA-A*02:01, HLA-B*07:02, HLA-C*24:02). (c) Raw data cleaning process and division ratio between training and test sets. (d) Observed peptide length frequencies across alleles. HLA-A alleles bind longer peptides more frequently than HLA-B and HLA-C alleles, which tend to bind shorter peptides. Panel a created with BioReader.com.
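The length and site-frequency statistics described for panels (b) and (d) could be tabulated along the following lines. This is a minimal sketch and not the authors' code: the function names `length_frequencies` and `position_frequencies` are hypothetical, the input is assumed to be a list of (allele, peptide) pairs, and the toy peptides other than GILGFVFTL are made up for illustration.

```python
from collections import Counter, defaultdict


def length_frequencies(pairs):
    """Return {allele: {peptide_length: fraction}} from (allele, peptide) pairs."""
    counts = defaultdict(Counter)
    for allele, peptide in pairs:
        counts[allele][len(peptide)] += 1
    result = {}
    for allele, counter in counts.items():
        total = sum(counter.values())
        result[allele] = {n: c / total for n, c in sorted(counter.items())}
    return result


def position_frequencies(peptides):
    """Return one {amino_acid: fraction} dict per position.

    All peptides must share the same length, mirroring the per-length
    grouping used in site-frequency plots.
    """
    if len({len(p) for p in peptides}) != 1:
        raise ValueError("peptides must be non-empty and share one length")
    return [
        {aa: c / len(peptides) for aa, c in Counter(column).items()}
        for column in zip(*peptides)
    ]


# Toy input (assumption): only GILGFVFTL appears in the source text.
toy_pairs = [
    ("HLA-A*02:01", "GILGFVFTL"),
    ("HLA-A*02:01", "GILGFVFTLK"),
    ("HLA-B*07:02", "APRGPHGG"),
]
print(length_frequencies(toy_pairs))
print(position_frequencies(["GILGFVFTL", "GLLGFVFTV"]))
```

Grouping peptides by length before calling `position_frequencies` keeps each position comparable across sequences, which is why the sketch rejects mixed-length input.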
Figure 2
Workflow diagram of the model. (a) Data input structure of the MHLAPre model and model workflow. (b) Training process of MHLAPre IM against pHLA complex epitope immunogenicity. (c) Transfer learning and prediction process of MHLAPre TT in the pHLA-TCR scenario. Panels a-c created with BioReader.com.
Figure 3
Performance evaluation of different pHLA antigen affinity prediction algorithms. (a) Comparison of the predictive accuracy of several antigen presentation prediction algorithms. The top panel displays the AUROC for each algorithm and the bottom panel shows the AUPRC. Each bar represents the average performance score of the respective algorithm (MHCflurry, NetMHCpan, MHCnuggets, MixMHCpred 2.2, and HLAthena) across all HLA types (ALL) and stratified individually for HLA-A, HLA-B, and HLA-C alleles. Error bars show the standard deviation of the performance scores, reflecting the variability of each algorithm's predictive power. The assessment shows that these tools differ in how well they predict antigen presentation by different HLA molecules. (b) The left panel shows the AUROC for each algorithm and the right panel shows the AUPRC. Each line represents the average performance score of an algorithm across antigenic peptide lengths.
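The two metrics named in this caption, and the per-locus averaging with standard deviations, can be illustrated with a short pure-Python sketch. It is not the authors' evaluation code. The function names are hypothetical, AUPRC is approximated here by average precision, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision@rank over the ranks of the positives.

    Tied scores keep their input order (assumption made for simplicity).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


def summarize_by_locus(per_allele_scores):
    """Group per-allele scores by locus (text before '*') into (mean, sample SD)."""
    groups = {}
    for allele, value in per_allele_scores.items():
        groups.setdefault(allele.split("*")[0], []).append(value)
    return {
        locus: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for locus, v in groups.items()
    }


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auroc(labels, scores), average_precision(labels, scores))
```

The pairwise AUROC loop is quadratic in the number of samples, which is acceptable for an illustration; a rank-based formulation would be preferable for large benchmark sets.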
Figure 4
Performance comparison of MHLAPre TT with different pHLA-TCR interaction prediction models. (a) Trend of the loss function during the MHLAPre TT transfer learning process, showing that the MHLAPre TT model learned the new task and converged quickly. (b) PPV of the top 2% for the comparison of different models. Each bar or violin represents the average performance score of the respective algorithm (pMTnet, PanPep, DLpTCR, and ERGO2) across all HLA types (ALL) and stratified individually for HLA-A, HLA-B, and HLA-C alleles. (c, d) Mean AUROC and AUPRC for different model comparisons. (e, f) Mean AUROC and AUPRC plots grouped by HLA-A, -B, and -C alleles.
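One common reading of "PPV of the top 2%" is the fraction of true positives among the 2% of candidates with the highest predicted scores. The sketch below follows that reading, which is an assumption because the caption does not define the metric; the function name `ppv_at_top_fraction` and the rounding rule for the cutoff are also hypothetical.

```python
def ppv_at_top_fraction(labels, scores, fraction=0.02):
    """Fraction of positives among the top-scoring `fraction` of samples.

    The cutoff is round(fraction * n) with a minimum of one sample
    (assumed rounding rule). Tied scores keep their input order.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(labels) != len(scores) or not scores:
        raise ValueError("labels and scores must be non-empty and equally long")
    k = max(1, round(fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in order[:k]) / k


# Toy example: 100 candidates, so the top 2% is the two best-scored ones.
toy_scores = [100 - i for i in range(100)]
toy_labels = [1, 0] + [1] * 98
print(ppv_at_top_fraction(toy_labels, toy_scores))
```

Because only the highest-ranked candidates are usually taken forward for experimental validation, a top-fraction PPV reflects that use case more directly than a threshold-free metric such as AUROC.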
Figure 5
MHLAPre TT achieves excellent performance on an independent dataset. (a) MHLAPre affinity scores for previously unseen pHLA pairs; good predictions are observed for high-frequency HLA alleles and for alleles with sequence similarity to high-frequency alleles in the training data. (b) Comparison of the performance of MHLAPre TT and pMHC-specific models (two NetTCR2.0 variants and MixTCRPred). The dataset was collated from 10x Genomics single-cell sequencing data, and the score comparison was restricted to pHLA-TCR-positive samples with the pHLA A0201_GILGFVFTL.
