Brief Bioinform. 2023 Nov 22;25(1):bbad504.

doi: 10.1093/bib/bbad504.

Unsupervised and supervised AI on molecular dynamics simulations reveals complex characteristics of HLA-A2-peptide immunogenicity

Jeffrey K Weber¹, Joseph A Morrone¹, Seung-Gu Kang¹, Leili Zhang¹, Lijun Lang¹, Diego Chowell², Chirag Krishna³, Tien Huynh¹, Prerana Parthasarathy⁴,⁵, Binquan Luan¹, Tyler J Alban⁴,⁵, Wendy D Cornell¹, Timothy A Chan⁴,⁵,⁶,⁷,⁸

Affiliations

¹ IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.
² Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
³ Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
⁴ Center for Immunotherapy and Precision Immuno-Oncology, Cleveland Clinic, Cleveland, OH 44195, USA.
⁵ Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44015, USA.
⁶ Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
⁷ Taussig Cancer Institute, Cleveland Clinic, Cleveland, OH 44015, USA.
⁸ National Center for Regenerative Medicine, Cleveland Clinic, Cleveland, OH 44015, USA.

PMID: 38233090
PMCID: PMC10793977
DOI: 10.1093/bib/bbad504

Jeffrey K Weber et al. Brief Bioinform. 2023.

Abstract

Immunologic recognition of peptide antigens bound to class I major histocompatibility complex (MHC) molecules is essential to both novel immunotherapeutic development and human health at large. Current methods for predicting antigen peptide immunogenicity rely primarily on simple sequence representations, which allow for some understanding of immunogenic features but provide inadequate consideration of the full scale of molecular mechanisms tied to peptide recognition. We here characterize contributions that unsupervised and supervised artificial intelligence (AI) methods can make toward understanding and predicting MHC(HLA-A2)-peptide complex immunogenicity when applied to large ensembles of molecular dynamics simulations. We first show that an unsupervised AI method allows us to identify subtle features that drive immunogenicity differences between a cancer neoantigen and its wild-type peptide counterpart. Next, we demonstrate that a supervised AI method for class I MHC(HLA-A2)-peptide complex classification significantly outperforms a sequence model on small datasets corrected for trivial sequence correlations. Furthermore, we show that both unsupervised and supervised approaches reveal determinants of immunogenicity based on time-dependent molecular fluctuations and anchor position dynamics outside the MHC binding groove. We discuss implications of these structural and dynamic immunogenicity correlates for the induction of T cell responses and therapeutic T cell receptor design.

Keywords: MHC–peptide complex; Markov models; cancer immunotherapy; graph convolutions; immunogenicity; molecular dynamics.

Figures

**Figure 1**
Slowest relaxation timescales estimated by unsupervised AI highlight concerted MHC–peptide dynamics and multiple peptide presentation modes. (A) Illustration of MHC–peptide complex system specific to the HLA-A2 supertype and representative of the cancer neoantigen/WT peptide pair under study. (B) MHC binding groove dynamics that represent the slowest timescale observed for both the neoantigen and WT peptides, with quantitative timescales presented for the neoantigen system. (C) Interconversion between peptide presentation modes that represents the second-slowest timescale observed for both neoantigen and WT peptides. P(Imm) values correspond to softmax outputs for each conformation derived from the best MD-graph immunogenicity prediction model presented in Section 2.
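The P(Imm) values in this caption are described as softmax outputs of a classifier. As a minimal sketch of that final step only, the snippet below converts raw class scores into probabilities. The two-class ordering, the function names and the example logits are assumptions made for illustration; they are not taken from the paper's MD-graph model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def p_immunogenic(logits, immunogenic_index=1):
    """Probability of the immunogenic class.

    The class ordering [non-immunogenic, immunogenic] is an assumption.
    """
    return softmax(logits)[immunogenic_index]


# Hypothetical logits for a single conformation.
print(p_immunogenic([0.0, 0.0]))  # equal scores give 0.5
print(p_immunogenic([-1.2, 0.8]))
```

Because softmax outputs sum to one, a two-class P(Imm) above 0.5 simply means the immunogenic score was the larger of the two.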

**Figure 2**
Presentation differences between cancer neoantigen and WT counterpart identified with unsupervised AI. (A) Dominant neoantigen and WT presentation modes drawn from the most populated Markov states of each model. (B) Quantitative estimates of residual solvent exposure differences between neoantigen and WT identified via population-weighted averages over the five most-populated Markov states in each model. (C) Equilibration of conformational populations propagated by each Markov model and projected onto the top two time-lagged independent components (tICs) obtained through dimensionality reduction of complex-wide dihedral angle vectors.
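Panel (B) refers to population-weighted averages over the five most-populated Markov states. One way such an average could be computed is sketched below. Renormalizing the populations within the selected states is an assumption, as are the function name and all numbers, which are hypothetical placeholders rather than data from the study.

```python
def population_weighted_average(populations, values, top_k=5):
    """Average `values` over the `top_k` most populated states.

    Weights are the state populations, renormalized within the selected
    subset (an assumption about how the average is defined).
    """
    if len(populations) != len(values):
        raise ValueError("populations and values must have the same length")
    # Indices of the most populated states, largest population first.
    ranked = sorted(range(len(populations)),
                    key=lambda i: populations[i], reverse=True)[:top_k]
    total = sum(populations[i] for i in ranked)
    return sum(populations[i] * values[i] for i in ranked) / total


# Hypothetical stationary populations and per-state solvent exposure
# values for one peptide residue in each model.
neo_pop = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]
neo_sasa = [55.0, 60.0, 48.0, 70.0, 52.0, 90.0]
wt_pop = [0.35, 0.30, 0.15, 0.10, 0.05, 0.05]
wt_sasa = [40.0, 42.0, 45.0, 38.0, 50.0, 80.0]

difference = (population_weighted_average(neo_pop, neo_sasa)
              - population_weighted_average(wt_pop, wt_sasa))
print(difference)
```

Restricting the average to the top states keeps sparsely sampled states from dominating the comparison, at the cost of ignoring their contribution.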

**Figure 3**
Classification of class I MHC–peptide complex immunogenicity based on molecular dynamics data. (A) MHC–peptide complex system subjected to MD simulation and class-differentiating observables distinguished by protein–peptide complex dynamics. (B) MD-graph deep learning architecture for MHC–peptide complex classification. (C) Performance comparison of MD-SASA, MD-Graph and sequence deep learning methods on the full A02 dataset.

**Figure 4**
Comparison of predictions derived from sequence-only and MD-graph deep learning architectures. (A) Distinct predictions derived from MD-graph and sequence immunogenicity models, as represented by softmax values output from each model type. (B) Orthogonal UMAP projections of internal representation vectors derived from sequence and MD-graph model types. Sequence model projections are shown in red (lower left) and MD-graph model projections are shown in blue (upper right). (C) Anecdotal example of MD-graph rescue of a sequence model prediction in the context of peptide LLILCVTQV.

**Figure 5**
Superior MD model performance on smaller training data sets corrected for trivial sequence correlations. (A) Illustration of Monte Carlo approach to sequence split debiasing. (B) Sequence-only and MD-graph classification results comparison across 20 small datasets subjected to a correlation correction procedure.

**Figure 6**
Anchor position spatial dynamics determine T cell immunogenicity. (A) Illustration of HLA-A02 supertype anchor binding pockets and HLA-A02-restricted peptide anchor positions. (B) Mean peptide anchor residue/MHC anchor pocket distances as a function of time, with error bars representing the full sampled conformational distributions. Computed P-values for mean peptide anchor distances and the anchor distance fluctuations (standard deviations; SDs) between the immunogenic and non-immunogenic peptide sets are presented in each subpanel for the 150–200 ns trajectory windows. (C) C-terminal peptide dynamics in two immunogenic peptide systems. Immunogenicity probability predictions are shown at right. (D) Backbone and sidechain RMSF values as a function of peptide position and immunogenicity class. Solid lines indicate mean values and shaded areas capture SDs. (E) Backbone RMSF values as a function of MHC-α position and immunogenicity class, with MHC groove helix residues highlighted with vertical shading and defined as being within approximately 8 Å of a reference peptide. Means and SDs are virtually indistinguishable between classes. (F) Correlation plot showing lack of relationship between quantitative binding affinity and P9 anchor dynamics.
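Panels (D) and (E) report root-mean-square fluctuation (RMSF) values. For reference, a per-atom RMSF can be computed from a trajectory as sketched below. The sketch assumes the frames have already been aligned to a common reference and uses a toy two-atom trajectory; the data layout and function name are illustrative assumptions, not the authors' analysis code.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom. Frames are
    assumed to be pre-aligned, so that fluctuations reflect internal
    motion rather than overall rotation or translation.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        # Mean squared deviation of atom i from its average position.
        msd = sum(
            sum((f[i][d] - mean[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 is static, atom 1 moves along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(toy_frames))  # [0.0, 1.0]
```

Per-residue backbone or sidechain values like those in the figure would follow by averaging the per-atom results over the atoms of interest in each residue.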

Cited by

Machine Learning of Molecular Dynamics Simulations Provides Insights into the Modulation of Viral Capsid Assembly.
Pavlova A, Fan Z, Lynch DL, Gumbart JC. J Chem Inf Model. 2025 May 26;65(10):4844-4853. doi: 10.1021/acs.jcim.5c00274. Epub 2025 May 8. PMID: 40338128.
Monitoring Immune Responses to Vaccination: A Focus on Single-Cell Analysis and Associated Challenges.
Montgomery L, Larbi A. Vaccines (Basel). 2025 Apr 16;13(4):420. doi: 10.3390/vaccines13040420. PMID: 40333304. Review.
Molecular Modelling in Bioactive Peptide Discovery and Characterisation.
Agoni C, Fernández-Díaz R, Timmons PB, Adelfio A, Gómez H, Shields DC. Biomolecules. 2025 Apr 3;15(4):524. doi: 10.3390/biom15040524. PMID: 40305228. Review.
Neoantigen immunogenicity landscapes and evolution of tumor ecosystems during immunotherapy with nivolumab.
Alban TJ, Riaz N, Parthasarathy P, Makarov V, Kendall S, Yoo SK, Shah R, Weinhold N, Srivastava R, Ma X, Krishna C, Mok JY, van Esch WJE, Garon E, Akerley W, Creelan B, Aanur N, Chowell D, Geese WJ, Rizvi NA, Chan TA. Nat Med. 2024 Nov;30(11):3209-3222. doi: 10.1038/s41591-024-03240-y. Epub 2024 Sep 30. PMID: 39349627.
