Sci Rep. 2021 May 24;11(1):10780.

doi: 10.1038/s41598-021-89927-5.

Predicting MHC I restricted T cell epitopes in mice with NAP-CNB, a novel online tool

Affiliations

¹ Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, 28049, Madrid, Spain.
² Departamento de Bioingenieria e Ingenieria Aeroespacial, Universidad Carlos III de Madrid, 28911, Leganés, Spain.
³ Bioengineering Department, Imperial College London, London, SW7 2AZ, UK.
⁴ Unit of Biomarkers and Susceptibility, Oncology Data Analytics Program (ODAP), Catalan Institute of Oncology (ICO), Oncobell Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908, L'Hospitalet de Llobregat, Spain.
⁵ Centro De Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP), Madrid, Spain.
⁶ Procure Program, Institut Català d'Oncologia- Oncobell Program, Catalan Institute of Oncology (ICO), Oncobell Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908, L'Hospitalet de Llobregat, Spain.
⁷ Departamento de Bioingenieria e Ingenieria Aeroespacial, Universidad Carlos III de Madrid, 28911, Leganés, Spain. mamunozb@ing.uc3m.es.
⁸ Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), 28007, Madrid, Spain. mamunozb@ing.uc3m.es.

^# Contributed equally.

PMID: 34031450
PMCID: PMC8144223
DOI: 10.1038/s41598-021-89927-5

Carlos Wert-Carvajal et al. Sci Rep. 2021.

Abstract

Lack of a dedicated integrated pipeline for neoantigen discovery in mice hinders cancer immunotherapy research. Novel sequential approaches through recurrent neural networks can improve the accuracy of T-cell epitope binding affinity predictions in mice, and a simplified variant selection process can reduce operational requirements. We have developed a web server tool (NAP-CNB) for a full and automatic pipeline based on recurrent neural networks, to predict putative neoantigens from tumoral RNA sequencing reads. The developed software can estimate H-2 peptide ligands, with an AUC comparable or superior to state-of-the-art methods, directly from tumor samples. As a proof-of-concept, we used the B16 melanoma model to test the system's predictive capabilities, and we report its putative neoantigens. NAP-CNB web server is freely available at http://biocomp.cnb.csic.es/NeoantigensApp/ with scripts and datasets accessible through the download section.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Workflow for the integrated pipeline. (a) The user interface of NAP-CNB with the fields required for NGS analysis. Users can set GATK filters for base quality score recalibration (BQSR) of RNA-Seq reads, minimum depth of coverage (DP) and allele frequency (AF). Additionally, users may submit peptide sequences for affinity prediction. Individual submissions are haplotype-specific, and results are sent to an email address. (b) Workflow for the integrated pipeline. First, the sample is preprocessed before variant calling. Quality control through FastQC and STAR alignment to the reference genome are followed by protocols from the GATK Best Practices. Known variants are introduced through known polymorphisms or, if requested and sufficient non-tumor RNA-Seq reads are provided, a panel-of-normals. MuTect2 is used for variant calling, and plausible single nucleotide variant (SNV) mutations are translated into peptide sequences for prediction with the RNN model. Gene expression is quantified through Cuffquant in Cufflinks.
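
The DP and AF filters described in the caption can be illustrated with a minimal Python sketch. It is not the NAP-CNB implementation: the `Variant` record, the `passes_filters` helper, and the threshold values are hypothetical and chosen only to show how a minimum depth and a minimum allele frequency might be applied to called SNVs.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one called variant (not a NAP-CNB type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    dp: int    # read depth (DP) at the site
    af: float  # fraction of reads supporting the alternate allele (AF)


def passes_filters(v: Variant, min_dp: int, min_af: float) -> bool:
    """Keep only single nucleotide variants that meet both thresholds."""
    is_snv = len(v.ref) == 1 and len(v.alt) == 1
    return is_snv and v.dp >= min_dp and v.af >= min_af


# Toy variants; coordinates and values are made up for illustration.
variants = [
    Variant("chr1", 1000, "A", "G", dp=35, af=0.40),   # kept
    Variant("chr2", 2000, "C", "T", dp=4, af=0.50),    # too shallow
    Variant("chr3", 3000, "G", "A", dp=50, af=0.01),   # AF too low
    Variant("chr4", 4000, "AT", "A", dp=60, af=0.30),  # not an SNV
]

# Illustrative thresholds only; the web form lets the user choose them.
kept = [v for v in variants if passes_filters(v, min_dp=10, min_af=0.05)]
print([(v.chrom, v.pos) for v in kept])
```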

**Figure 2**
Neural network model of the binding affinity prediction for H-2K^b. The input sequence corresponds to a one-hot encoding of a 12 mer peptide sequence extracted from the preprocessing workflow. The number of LSTM units corresponds to the input sequence’s overall length across the three consecutive layers. Following the RNN, two hidden dense units, with alternating dropouts, serve to process an affinity probability.
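
As a rough illustration of the one-hot input described in the caption, the sketch below encodes a 12-residue peptide as a 12 x 20 binary matrix. The alphabet ordering, the `one_hot_encode` helper, and the example peptide are assumptions for illustration; the published model may use a different symbol set or ordering.

```python
# Assumed alphabet: the 20 standard amino acids in alphabetical
# one-letter order. The ordering used by NAP-CNB is not stated here.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide: str) -> list[list[int]]:
    """Return one row per residue, with a single 1 at the residue's index."""
    matrix = []
    for aa in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1  # raises KeyError for non-standard residues
        matrix.append(row)
    return matrix


# Arbitrary 12-residue example string, used only to show the shape.
peptide = "ESIINFEKLTEW"
encoded = one_hot_encode(peptide)
print(len(encoded), len(encoded[0]))  # rows x columns of the input matrix
```

Each row would then correspond to one time step of the recurrent network's input sequence.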

**Figure 3**
ROC and precision-recall curves for the final model trained with H-2K^b samples. (a) ROC curve for 10% test partition with an AUC of 86.5%, the dashed line shows chance level. (b) Precision-recall curve with the prevalence of around 3% shown as chance. The precision-recall AUC is 41.97%, whereas a random guess corresponds to an AUC of 2.64% for the same data imbalance.
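
The two chance levels quoted in the caption differ because a random classifier has a ROC AUC of 0.5 regardless of class balance, whereas its expected precision equals the prevalence of positives. The sketch below illustrates both quantities on toy data; the labels, scores, and helper names are hypothetical and unrelated to the paper's dataset.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a
    random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pr_chance_level(labels: list[int]) -> float:
    """Expected precision of a random classifier: the positive prevalence."""
    return sum(labels) / len(labels)


# Toy data: 2 binders among 8 peptides (prevalence 25%).
labels = [1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

auc = roc_auc(labels, scores)
baseline = pr_chance_level(labels)
print(f"ROC AUC = {auc:.3f}, PR chance level = {baseline:.3f}")
```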

**Figure 4**
Cross-validation of peptide window sizes for H-2K^b. The area under the curve of the receiver operating characteristic curve using 8 mers, 9 mers, and 12 mers obtained through fivefold cross-validation in different conditions. The windows are obtained from the mutated peptide sequence centered at the location of the SNV. Significant differences between means (Student’s t-test, $p < 0.05$) are shown.
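
A window centered on the mutated residue might be extracted as in the sketch below. The helper `centered_window`, the convention for even window sizes (the mutated residue is placed just left of center), and the choice to return `None` when the window does not fit are all assumptions; the caption does not specify how NAP-CNB handles these cases.

```python
from typing import Optional


def centered_window(sequence: str, pos: int, size: int) -> Optional[str]:
    """Return a window of `size` residues around 0-based index `pos`.

    Assumption: for even sizes the mutated residue sits immediately left
    of the midpoint. Returns None if the window would run off either end.
    """
    start = pos - (size - 1) // 2
    end = start + size
    if start < 0 or end > len(sequence):
        return None
    return sequence[start:end]


# Toy protein sequence; the mutated residue is 'M' at index 10.
protein = "ACDEFGHIKLMNPQRSTVWY"
mutation_index = 10

windows = {k: centered_window(protein, mutation_index, k) for k in (8, 9, 12)}
for k, w in windows.items():
    print(k, w)
```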
