Review
Brief Bioinform. 2025 Mar 4;26(2):bbaf087.
doi: 10.1093/bib/bbaf087.

Integration of proteomics profiling data to facilitate discovery of cancer neoantigens: a survey

Shifu Luo et al. Brief Bioinform.

Abstract

Cancer neoantigens are peptides that originate from alterations in the genome, transcriptome, or proteome. These peptides can elicit cancer-specific T-cell recognition, making them potential candidates for cancer vaccines. The rapid advancement of proteomics technology holds tremendous potential for identifying these neoantigens. Here, we provide an up-to-date survey of database search methods and de novo peptide sequencing approaches in proteomics, and we compare these methods to recommend reliable analytical tools for neoantigen identification. Unlike previous surveys on mass spectrometry-based neoantigen discovery, this survey summarizes the key advancements in de novo peptide sequencing approaches that utilize artificial intelligence. In a comparative study on a dataset comprising the HepG2 cell line and nine mixed hepatocellular carcinoma proteomics samples, we demonstrate the potential of proteomics for identifying cancer neoantigens and compare the existing methods to illustrate their limits. Based on these limits, we propose a novel workflow for neoantigen discovery as a perspective.

Keywords: cancer neoantigens; database-based search methods; de novo peptide sequencing; deep learning; proteomics.

Figures

Figure 1
The pipeline of in silico cancer neoantigen screening and the pathways of neoantigen generation. (A) The process of antigen presentation by MHC class I/II molecules in APCs and the activation of T cell responses, along with the corresponding neoantigen screening tools for each step. MHC class I molecules are responsible for endogenous antigen presentation and CD8+ T cell activation. MHC class II molecules are responsible for exogenous antigen presentation and CD4+ T cell activation. (B) The mechanisms of neoantigen generation include genomic variations (SNVs, indels, and gene mutations) and alternative splicing variants in the transcriptome, both of which diversify protein products.
Figure 2
Shotgun proteomics workflow. The experimental spectra are obtained by protein cleavage followed by mass spectrometry, while the theoretical spectra are obtained by cleaving the reference protein sequences in silico at the sites expected for the corresponding cleavage method. Peptides are identified by matching and scoring the theoretical spectra against the experimental spectra, and the identified peptides are then assembled to identify proteins.
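The database-search idea in this caption can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the method of any tool surveyed here: it assumes trypsin as the cleavage method (cut after K or R unless followed by P), no missed cleavages or modifications, and it matches only precursor masses within a ppm tolerance rather than scoring full fragment spectra. The function names (`tryptic_digest`, `peptide_mass`, `match_precursors`) and the example protein fragment are hypothetical and chosen for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_digest(protein, min_len=6):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return [p for p in peptides if len(p) >= min_len]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_precursors(observed_masses, proteins, tol_ppm=10.0):
    """Map each observed neutral mass to theoretical peptides within tol_ppm."""
    theoretical = {p: peptide_mass(p)
                   for prot in proteins for p in tryptic_digest(prot)}
    matches = {}
    for obs in observed_masses:
        matches[obs] = [p for p, m in theoretical.items()
                        if abs(obs - m) / m * 1e6 <= tol_ppm]
    return matches


if __name__ == "__main__":
    protein = "MKWVTFISLLFLFSSAYSRGVFRRDAHK"  # illustrative sequence fragment
    target = peptide_mass("WVTFISLLFLFSSAYSR")
    print(match_precursors([target], [protein]))
```

A real search engine would additionally generate theoretical fragment ions for each candidate peptide and score them against the measured MS/MS spectrum; the precursor filter above corresponds only to the candidate-selection step.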
Figure 3
The co-evolution of peptide de novo sequencing methods and AI algorithms. (A) Peptide de novo sequencing methods have shifted from machine learning algorithms to deep learning algorithms (from left to right). (B) Performance of the peptide de novo sequencing methods on different datasets: (i) pNovo [120] identified 181 and 612 correct peptides in the two datasets, more than PepNovo and PEAKS; (ii) Novor [121] outperforms PEAKS in recall on four datasets; (iii) GraphNovo [125] outperforms other methods in recall on all three datasets.
Figure 4
A case study of neoantigen identification in HCC patients. (A) Sample information on liver cancer and the workflow of neoantigen identification. (B) Venn diagram of the neoantigen candidates identified by the two methods. (C) Venn diagram of the genes identified by Casanovo in three technical replicates. (D) Detection of high-frequency mutant genes in HCC by the two methods. The list on the right shows the highly mutated genes in different HCC cohorts.
Figure 5
The proteomics-based neoantigen identification workflow. MS-based neoantigen identification includes both database-dependent and database-independent methods. The analysis process is selected according to the type of input data, and different processes yield different numbers of neoantigen candidates. Database search depends on a comprehensive reference sequence library or a self-reference sequence library (built from RNA-seq data). Database-independent methods require a self-reference sequence library for validation: the sequences obtained by reverse translation of the predicted peptides are matched against the cancer-specific sequence library obtained from RNA-seq data to obtain neoantigen candidates.
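The validation step for database-independent methods can be sketched as a simple filter. This is an illustrative sketch, not the authors' pipeline, and it makes several assumptions: matching is done at the protein level against translated RNA-seq sequences instead of back-translating peptides to nucleotides; peptides that also occur in a normal reference proteome are excluded (this exclusion is not stated in the caption); and isoleucine is collapsed to leucine before comparison, because the two residues have the same mass and cannot be told apart by mass alone. The function names and toy sequences are hypothetical.

```python
def normalize(seq):
    """Upper-case and collapse I to L, since the two residues are isobaric."""
    return seq.upper().replace("I", "L")


def neoantigen_candidates(de_novo_peptides, reference_proteins, cancer_proteins):
    """Keep peptides found in the cancer-specific library but not in the reference.

    All three arguments are iterables of amino-acid strings. Matching is exact
    substring matching after I/L normalization (an assumption of this sketch).
    """
    reference = [normalize(p) for p in reference_proteins]
    cancer = [normalize(p) for p in cancer_proteins]
    candidates = []
    for peptide in de_novo_peptides:
        key = normalize(peptide)
        in_reference = any(key in protein for protein in reference)
        in_cancer = any(key in protein for protein in cancer)
        if in_cancer and not in_reference:
            candidates.append(peptide)
    return candidates


if __name__ == "__main__":
    reference = ["MSTAVLENPGLGRKLSDFGQETSYIE"]   # toy normal sequence
    cancer = ["MSTAVLENPGLGRKLSDFGQKTSYIE"]      # toy variant (E -> K)
    peptides = ["LSDFGQKTSYLE", "LSDFGQETSYIE", "AAAAAAAAA"]
    print(neoantigen_candidates(peptides, reference, cancer))
```

In this toy run only the first peptide survives: it spans the variant site and, after I/L normalization, matches the cancer-specific sequence but not the reference. The second peptide is a normal sequence, and the third matches neither library.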

References

    1. Moore L, Cagan A, Coorens THH, et al. The mutational landscape of human somatic and germline cells. Nature 2021;597:381–6. doi:10.1038/s41586-021-03822-7
    2. Seferbekova Z, Lomakin A, Yates LR, et al. Spatial biology of cancer evolution. Nat Rev Genet 2023;24:295–313. doi:10.1038/s41576-022-00553-x
    3. Harrington KJ, Nenclares P. The biology of cancer. Medicine 2023;51:1–6. doi:10.1016/j.mpmed.2022.10.001
    4. Yarchoan M, Johnson BA 3rd, Lutz ER, et al. Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer 2017;17:209–22. doi:10.1038/nrc.2016.154
    5. Hacohen N, Fritsch EF, Carter TA, et al. Getting personal with neoantigen-based therapeutic cancer vaccines. Cancer Immunol Res 2013;1:11–5. doi:10.1158/2326-6066.Cir-13-0022

Substances