Brief Bioinform. 2023 Nov 22;25(1):bbad504.
doi: 10.1093/bib/bbad504.

Unsupervised and supervised AI on molecular dynamics simulations reveals complex characteristics of HLA-A2-peptide immunogenicity

Jeffrey K Weber et al. Brief Bioinform. 2023.

Abstract

Immunologic recognition of peptide antigens bound to class I major histocompatibility complex (MHC) molecules is essential to both novel immunotherapeutic development and human health at large. Current methods for predicting antigen peptide immunogenicity rely primarily on simple sequence representations, which allow for some understanding of immunogenic features but provide inadequate consideration of the full scale of molecular mechanisms tied to peptide recognition. We here characterize contributions that unsupervised and supervised artificial intelligence (AI) methods can make toward understanding and predicting MHC(HLA-A2)-peptide complex immunogenicity when applied to large ensembles of molecular dynamics simulations. We first show that an unsupervised AI method allows us to identify subtle features that drive immunogenicity differences between a cancer neoantigen and its wild-type peptide counterpart. Next, we demonstrate that a supervised AI method for class I MHC(HLA-A2)-peptide complex classification significantly outperforms a sequence model on small datasets corrected for trivial sequence correlations. Furthermore, we show that both unsupervised and supervised approaches reveal determinants of immunogenicity based on time-dependent molecular fluctuations and anchor position dynamics outside the MHC binding groove. We discuss implications of these structural and dynamic immunogenicity correlates for the induction of T cell responses and therapeutic T cell receptor design.

Keywords: MHC–peptide complex; Markov models; cancer immunotherapy; graph convolutions; immunogenicity; molecular dynamics.

Figures

Figure 1
Slowest relaxation timescales estimated by unsupervised AI highlight concerted MHC–peptide dynamics and multiple peptide presentation modes. (A) Illustration of MHC–peptide complex system specific to the HLA-A2 supertype and representative of the cancer neoantigen/WT peptide pair under study. (B) MHC binding groove dynamics that represent the slowest timescale observed for both the neoantigen and WT peptides, with quantitative timescales presented for the neoantigen system. (C) Interconversion between peptide presentation modes that represents the second-slowest timescale observed for both neoantigen and WT peptides. P(Imm) values correspond to softmax outputs for each conformation derived from the best MD-graph immunogenicity prediction model presented in Section 2.
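The relaxation timescales in the Figure 1 caption are the kind of quantity usually read off a Markov model's transition matrix. The sketch below is illustrative only and is not the authors' pipeline: it assumes a hypothetical two-state model with made-up transition probabilities and lag time, and uses the standard relation t = -lag / ln(eigenvalue), where the non-trivial eigenvalue of a two-state row-stochastic matrix is 1 - p_ab - p_ba.

```python
import math


def two_state_timescale(p_ab: float, p_ba: float, lag_time_ns: float) -> float:
    """Relaxation timescale of a two-state Markov model (illustrative).

    The row-stochastic transition matrix
        [[1 - p_ab, p_ab],
         [p_ba, 1 - p_ba]]
    has eigenvalues 1 and (1 - p_ab - p_ba). The second eigenvalue sets
    the relaxation timescale t = -lag / ln(eigenvalue).
    """
    eigenvalue = 1.0 - p_ab - p_ba
    if not 0.0 < eigenvalue < 1.0:
        raise ValueError("expected 0 < eigenvalue < 1 for a relaxing process")
    return -lag_time_ns / math.log(eigenvalue)


def stationary_populations(p_ab: float, p_ba: float) -> tuple:
    """Equilibrium populations (state A, state B) of the same two-state model."""
    total = p_ab + p_ba
    return (p_ba / total, p_ab / total)


# Hypothetical numbers: 1% hop probability each way per 1 ns lag.
timescale = two_state_timescale(0.01, 0.01, lag_time_ns=1.0)
populations = stationary_populations(0.01, 0.03)
```

Models with more than two states need a numerical eigendecomposition of the full transition matrix, but the relation between eigenvalues and timescales is the same.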
Figure 2
Presentation differences between cancer neoantigen and WT counterpart identified with unsupervised AI. (A) Dominant neoantigen and WT presentation modes drawn from the most populated Markov states of each model. (B) Quantitative estimates of residual solvent exposure differences between neoantigen and WT identified via population-weighted averages over the five most-populated Markov states in each model. (C) Equilibration of conformational populations propagated by each Markov model and projected onto the top two time-lagged independent components (tICs) obtained through dimensionality reduction of complex-wide dihedral angle vectors.
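Panel B of Figure 2 refers to population-weighted averages over the five most-populated Markov states. A minimal sketch of that kind of average follows. The function name, the renormalization of populations over the selected states, and the example numbers are all assumptions made for illustration; the caption does not specify these details.

```python
def population_weighted_average(populations, values, top_n=5):
    """Average `values` over the `top_n` most populated states.

    populations: equilibrium population of each Markov state
    values: observable per state (e.g. a solvent exposure estimate)
    Weights are renormalized over the selected states (an assumption).
    """
    if len(populations) != len(values):
        raise ValueError("populations and values must have the same length")
    # Rank states by population, largest first, and keep the top_n.
    ranked = sorted(zip(populations, values), key=lambda pv: pv[0], reverse=True)
    selected = ranked[:top_n]
    total = sum(p for p, _ in selected)
    if total <= 0:
        raise ValueError("selected populations must sum to a positive value")
    return sum(p * v for p, v in selected) / total


# Hypothetical per-state populations and exposure values for two models.
neo = population_weighted_average([0.4, 0.3, 0.1, 0.1, 0.05, 0.05],
                                  [10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
wt = population_weighted_average([0.5, 0.2, 0.1, 0.1, 0.05, 0.05],
                                 [12.0, 18.0, 30.0, 40.0, 50.0, 60.0])
exposure_difference = neo - wt
```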
Figure 3
Classification of class I MHC–peptide complex immunogenicity based on molecular dynamics data. (A) MHC–peptide complex system subjected to MD simulation and class-differentiating observables distinguished by protein–peptide complex dynamics. (B) MD-graph deep learning architecture for MHC–peptide complex classification. (C) Performance comparison of MD-SASA, MD-Graph and sequence deep learning methods on the full A02 dataset.
Figure 4
Comparison of predictions derived from sequence-only and MD-graph deep learning architectures. (A) Distinct predictions derived from MD-graph and sequence immunogenicity models, as represented by softmax values output from each model type. (B) Orthogonal UMAP projections of internal representation vectors derived from sequence and MD-graph model types. Sequence model projections are shown in red (lower left) and MD-graph model projections are shown in blue (upper right). (C) Anecdotal example of MD-graph rescue of a sequence model prediction in the context of peptide LLILCVTQV.
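The softmax values mentioned in the Figure 1 and Figure 4 captions turn a classifier's raw output scores into class probabilities. The sketch below shows the standard softmax for a two-class case. The logit values and the class order (immunogenic first) are hypothetical and are not taken from the paper's models.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for (immunogenic, non-immunogenic).
p_imm, p_non = softmax([2.0, 0.0])
```

With two classes, the first softmax output equals the logistic function of the difference between the two logits.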
Figure 5
Superior MD model performance on smaller training data sets corrected for trivial sequence correlations. (A) Illustration of Monte Carlo approach to sequence split debiasing. (B) Sequence-only and MD-graph classification results comparison across 20 small datasets subjected to a correlation correction procedure.
Figure 6
Anchor position spatial dynamics determine T cell immunogenicity. (A) Illustration of HLA-A02 supertype anchor binding pockets and HLA-A02-restricted peptide anchor positions. (B) Mean peptide anchor residue/MHC anchor pocket distances as a function of time, with error bars representing the full sampled conformational distributions. Computed P-values for mean peptide anchor distances and the anchor distance fluctuations (standard deviations; SDs) between the immunogenic and non-immunogenic peptide sets are presented in each subpanel for the 150–200 ns trajectory windows. (C) C-terminal peptide dynamics in two immunogenic peptide systems. Immunogenicity probability predictions are shown at right. (D) Backbone and sidechain RMSF values as a function of peptide position and immunogenicity class. Solid lines indicate mean values and shaded areas capture SDs. (E) Backbone RMSF values as a function of MHC-α position and immunogenicity class, with MHC groove helix residues highlighted with vertical shading and defined as being within approximately 8 Å of a reference peptide. Means and SDs are virtually indistinguishable between classes. (F) Correlation plot showing lack of relationship between quantitative binding affinity and P9 anchor dynamics.
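Panels D and E of Figure 6 report RMSF (root-mean-square fluctuation) values. As a rough illustration of what that quantity measures, the sketch below computes per-atom RMSF from a list of coordinate frames. It assumes the frames are already aligned to a common reference, and the tiny trajectory is made up; real analyses would typically use an MD analysis library rather than plain lists.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for trajectory[frame][atom] = (x, y, z).

    Assumes all frames are pre-aligned and contain the same atoms.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Mean position of atom a over all frames.
        mean = [sum(frame[a][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical two-frame, two-atom trajectory (arbitrary length units).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
fluctuations = rmsf(frames)
```

Here the first atom never moves and the second oscillates symmetrically about the origin, so the two values differ accordingly.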
