Review
Front Oncol. 2023 Jan 9:12:1054231.
doi: 10.3389/fonc.2022.1054231. eCollection 2022.

Artificial intelligence applied in neoantigen identification facilitates personalized cancer immunotherapy

Yu Cai et al. Front Oncol.

Abstract

The field of cancer neoantigen research has developed swiftly over the past decade. Predicting novel, genuine neoantigens from large multi-omics datasets has become a difficult but critical challenge. The rise of artificial intelligence (AI) and machine learning (ML) in biomedical applications has strengthened current computational pipelines for neoantigen prediction. ML algorithms offer powerful tools for capturing the multidimensional nature of omics data and extracting the key neoantigen features that enable successful discovery of new neoantigens. This review outlines the significant technological progress in machine learning approaches, especially the newly developed deep learning tools and pipelines, recently applied to neoantigen prediction, and summarizes the current state-of-the-art tools. The standard workflow includes calling genetic variants in paired tumor and blood samples, rating the binding affinity between the mutated peptide, MHC (class I and II), and the T cell receptor (TCR), and then characterizing the immunogenicity of tumor epitopes. More specifically, we highlight the outstanding feature extraction tools and multi-layer neural network architectures in typical ML models. Notably, more integrated neoantigen-prediction pipelines are now constructed with hybrid or combined ML algorithms rather than conventional machine learning models. In addition, we discuss the trends and challenges in further optimizing and integrating the existing pipelines.
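
As an illustration of the step between variant calling and binding-affinity rating, the sketch below enumerates the candidate mutant peptides that would be passed to a binding predictor. It is a minimal sketch and not taken from the review: the function name `mutant_peptides`, the toy protein sequence, and the 8 to 11 residue window (a commonly used length range for MHC class I ligands) are all assumptions made for illustration.

```python
def mutant_peptides(protein, pos, alt, lengths=(8, 9, 10, 11)):
    """Return every peptide of the given lengths that spans a substituted residue.

    protein: wild-type amino-acid sequence (hypothetical input)
    pos:     0-based index of the missense substitution
    alt:     mutant residue replacing protein[pos]
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos lies outside the protein sequence")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # A window of length k covers pos when start <= pos <= start + k - 1.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.append(mutated[start:start + k])
    return peptides


# Toy example: substitute residue 10 (M) with A in an arbitrary 20-residue sequence.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
nonamers = mutant_peptides(toy_protein, pos=10, alt="A", lengths=(9,))
```

Each returned peptide contains the mutant residue, so every candidate differs from the corresponding wild-type sequence. Scoring these candidates against a patient's MHC alleles is left to whichever binding predictor the pipeline uses.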

Keywords: artificial intelligence; cancer immunotherapy; cancer neoantigen; neoantigen prediction; next generation sequencing.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
A typical multi-layer neural network architecture showing the feature extraction and prediction procedures for neoantigen prediction. Tumor (tissue) and normal (PBMC) samples are processed by NGS-based genomic profiling (WES, WGS, etc.) and undergo bioinformatic analysis to produce input data for machine learning training. The key features (tumor abundance, etc.) extracted from the input data are fed to a typical neural network (deep learning) model and filters, which output a prediction score (or predicted value) for the class (positive or negative for neoantigen) to which the input data belong. The colored circles represent the input data collected from tumor components (pink, T1 to Tn) and normal cells or immune receptors (purple or sapphire, N1 to Nn), as well as the feature variables (yellow, F1 to Fn) extracted from the omics data input. The gray arrows connecting the circles show how the neurons are interconnected and stacked to constitute a layer, and how multiple layers are placed next to each other to construct the neural network model. Figure was created with BioRender.com.
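
The forward pass described in the Figure 1 caption can be sketched in a few lines. This is a minimal, untrained example under stated assumptions, not the architecture of any tool covered by the review: the three input feature values, the layer sizes, the ReLU and sigmoid activations, and the 0.5 decision threshold are all hypothetical choices.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: every output neuron is connected to every input."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random, untrained weights; a real model would learn these from labeled data."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def predict_score(features, layers):
    """Pass the feature vector through the stacked layers and return a score in (0, 1)."""
    activations = features
    for i, (weights, biases) in enumerate(layers):
        is_output = i == len(layers) - 1
        activations = dense(activations, weights, biases, sigmoid if is_output else relu)
    return activations[0]


rng = random.Random(0)
# Hypothetical network: 3 input features, two hidden layers of 4 neurons, 1 output neuron.
layers = [init_layer(3, 4, rng), init_layer(4, 4, rng), init_layer(4, 1, rng)]
# Hypothetical scaled features F1..F3 (for example, tumor abundance).
features = [0.8, 0.3, 0.5]
score = predict_score(features, layers)
label = "positive" if score >= 0.5 else "negative"
```

Because the weights are random, the score carries no biological meaning. The sketch only shows how stacked, fully connected layers turn a feature vector into a single class score.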
Figure 2
A proposed neoantigen-prediction workflow implemented with machine learning (ML) models targeting individual characteristics. A set of verified neoantigen data is split into two datasets, one for Training + Development and one for Testing. (A) Upper dotted-line box: Model Training and Development. 1) Individual features of the training data with known class (positive or negative for neoantigen) are produced from NGS profiling either directly or indirectly, since additional rounds of analysis may be needed to generate the variables; 2) as indicated by the dashed arrows, these feature variables serve as input to three ML models (colored boxes) targeting three characteristics: peptide-MHC binding (model a, sapphire), TCR-pMHC binding (model b, yellow), and immunogenicity (model c, pink); 3) as indicated by the arrows, each model learns from its own input data and generates a prediction, or the models together produce an integrated prediction; 4) each ML model compares its prediction against the true class of the training data and learns from the comparison, followed by optimization aimed at a better prediction. (B) Lower dotted-line box: Independent Validation and Testing. After the predictive models have been trained and developed, a candidate neoantigen undergoes NGS-based genomic profiling to generate input data, which are then processed by the three trained models (a-c, colored clouds) to provide the predictions. Figure was created with BioRender.com.
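
The workflow in the Figure 2 caption (split the labeled data, train one model per characteristic, then combine the predictions) can be sketched as follows. This is a toy illustration under loud assumptions: the data are synthetic, each "model" is a one-feature logistic regression standing in for whatever ML model a real pipeline would use, and averaging the three scores is only one possible way to form an integrated prediction. None of the names or numbers come from the review.

```python
import math
import random

FEATURES = ("pmhc_binding", "tcr_pmhc_binding", "immunogenicity")  # hypothetical names


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, epochs=200, lr=0.5):
    """Fit a one-feature logistic model by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction compared against the true class
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    w, b = model
    return sigmoid(w * x + b)


def make_record(rng):
    """Synthetic record: each feature is the class label plus Gaussian noise."""
    label = rng.randint(0, 1)
    record = {name: label + rng.gauss(0.0, 0.6) for name in FEATURES}
    record["label"] = label
    return record


rng = random.Random(42)
data = [make_record(rng) for _ in range(200)]
rng.shuffle(data)
train, test = data[:150], data[150:]  # Training + Development vs. Testing

# One model per characteristic (models a, b and c in the caption).
labels = [r["label"] for r in train]
models = {name: train_logistic([r[name] for r in train], labels) for name in FEATURES}


def integrated_score(record):
    """Average the three per-model scores into one integrated prediction."""
    return sum(predict(models[name], record[name]) for name in FEATURES) / len(FEATURES)


# Independent testing on records the models never saw during training.
correct = sum((integrated_score(r) >= 0.5) == bool(r["label"]) for r in test)
accuracy = correct / len(test)
```

Keeping the test records out of training is what makes the final accuracy an independent estimate, which corresponds to the separation between boxes (A) and (B) in the figure.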
