Elife. 2024 Oct 28:13:RP94658.

doi: 10.7554/eLife.94658.

The T cell receptor β chain repertoire of tumor infiltrating lymphocytes improves neoantigen prediction and prioritization

Thi Mong Quynh Pham¹, Thanh Nhan Nguyen¹, Bui Que Tran Nguyen¹, Thi Phuong Diem Tran¹, Nguyen My Diem Pham¹, Hoang Thien Phuc Nguyen¹, Thi Kim Cuong Ho¹, Dinh Viet Linh Nguyen¹, Huu Thinh Nguyen², Duc Huy Tran², Thanh Sang Tran², Truong Vinh Ngoc Pham², Minh Triet Le², Thi Tuong Vy Nguyen¹, Minh-Duy Phan¹, Hoa Giang¹, Hoai-Nghia Nguyen¹, Le Son Tran¹

Affiliations

¹ Medical Genetics Institute, Ho Chi Minh City, Viet Nam.
² University Medical Center Ho Chi Minh City, Ho Chi Minh City, Viet Nam.

PMID: 39466298
PMCID: PMC11517254
DOI: 10.7554/eLife.94658

Thi Mong Quynh Pham et al. Elife. 2024.

Abstract

In the realm of cancer immunotherapy, the meticulous selection of neoantigens plays a fundamental role in enhancing personalized treatments. Traditionally, this selection process has heavily relied on predicting the binding of peptides to human leukocyte antigens (pHLA). Nevertheless, this approach often overlooks the dynamic interaction between tumor cells and the immune system. In response to this limitation, we have developed an innovative prediction algorithm rooted in machine learning, integrating T cell receptor β chain (TCRβ) profiling data from colorectal cancer (CRC) patients for a more precise neoantigen prioritization. TCRβ sequencing was conducted to profile the TCR repertoire of tumor-infiltrating lymphocytes (TILs) from 28 CRC patients. The data unveiled both intra-tumor and inter-patient heterogeneity in the TCRβ repertoires of CRC patients, likely resulting from the stochastic utilization of V and J segments in response to neoantigens. Our novel combined model integrates pHLA binding information with pHLA-TCR binding to prioritize neoantigens, resulting in heightened specificity and sensitivity compared to models using individual features alone. The efficacy of our proposed model was corroborated through ELISpot assays on long peptides, performed on four CRC patients. These assays demonstrated that neoantigen candidates prioritized by our combined model outperformed predictions made by the established tool NetMHCpan. This comprehensive assessment underscores the significance of integrating pHLA binding with pHLA-TCR binding analysis for more effective immunotherapeutic strategies.

Keywords: CRC; TCR sequencing; TCRseq; cancer biology; cancer immunotherapy; colorectal cancer; human; neoantigen; neoantigen prioritization; tumor variant calling.

Conflict of interest statement

TP, TN, BT, TD, ND, HP, TC, DL, HN, DT, TT, TP, ML, TV, MP, HG, HN, LT No competing interests declared

Figures

**Figure 1. A novel workflow based on machine learning that integrates T cell receptor β (TCRβ) sequencing data for the identification and ranking of colorectal cancer (CRC) neoantigens.**
(A) Tumor biopsies and peripheral blood from CRC patients were subjected to targeted DNA-seq, RNA-seq, and T cell receptor (TCR)-seq. (B) Peptide-human leukocyte antigen (HLA) binding and peptide-HLA-TCR binding were predicted by the indicated tools from the DNA-seq, RNA-seq, and TCR-seq data. (C) Machine learning models were then built on the peptide-HLA binding and peptide-HLA-TCR binding features to distinguish immunogenic antigens from non-immunogenic peptides. (D) The immunogenicity of the neoantigen candidates prioritized by the model was validated by enzyme-linked immunospot (ELISpot) to evaluate the effectiveness of this approach.

**Figure 2. Tumor-infiltrating T cell receptor β (TCRβ) profiles in 28 colorectal cancer patients.**
(A) A bar plot depicting the distribution of T cell receptor (TCR) clonotypes among 28 colorectal cancer (CRC) patients, categorized into two groups: clonotypes with a read count of 1 and clonotypes with a read count greater than or equal to 2. (B) The scatter plot illustrates the relationship between the Shannon index and the number of TCR clones. (C) The rarefaction plot shows the relationship between sample size and diversity among 28 CRC samples.
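The diversity measures named in this caption can be illustrated with a minimal sketch that computes a Shannon index from clonotype read counts. The natural-log base, the definition of clonality as one minus normalized entropy, and all function names are assumptions made for illustration; the record does not state which formulas or tools the authors used.

```python
import math


def shannon_index(counts):
    """Shannon entropy (natural log) of a list of clonotype read counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    # Clonotypes with zero reads contribute nothing to the entropy.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def clonality(counts):
    """Assumed definition: 1 - H / ln(number of clonotypes).

    0 means reads are spread evenly; 1 means a single dominant clonotype.
    """
    n_clones = sum(1 for c in counts if c > 0)
    if n_clones <= 1:
        return 1.0
    return 1.0 - shannon_index(counts) / math.log(n_clones)


def split_by_read_count(counts):
    """Count clonotypes with exactly one read and with two or more reads."""
    singletons = sum(1 for c in counts if c == 1)
    expanded = sum(1 for c in counts if c >= 2)
    return singletons, expanded
```

For a hypothetical sample with read counts `[1, 1, 5, 2]`, `split_by_read_count` separates the two groups shown in panel A, and `shannon_index` gives the quantity plotted against clone number in panel B.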

**Figure 2—figure supplement 1. Quality control metrics for tumor-infiltrating lymphocyte (TIL) T cell receptor β (TCRβ) analysis.**
(A) Distribution of CDR3β lengths in total T cell receptor (TCR) clones. (B) The pie chart displays the recurrence rates of TCR clones, variable (V) segments, and joining (J) segments when the read count of TCR clones exceeds 1. The graph illustrates the uniqueness of TCR clones and the shared presence of both V and J segments. (C) The heatmap depicts the Z-scored read counts of V segments or (D) J segments across 28 samples. Some V and J segments were found to be dominant in all samples. (E) The chord diagram illustrates the rearrangement of V and J segments, revealing random V and J combinations, with a few combinations exhibiting high frequencies.

**Figure 2—figure supplement 2. Association between tumor-infiltrating lymphocyte (TIL) T cell receptor β (TCRβ) profiles and patients' characteristics.**
The bar plot and dot plot compare T cell receptor (TCR) clones, Shannon index, and clonality between high microsatellite instability (MSI-H) and microsatellite stability (MSS) (**A, B, C**), stage II and III (**D, E, F**), female and male gender (**G, H, I**), and distal and proximal tumor locations (**K, L, M**).

**Figure 2—figure supplement 3. Rarefaction between microsatellite instability (MSI) and microsatellite stability (MSS) samples.**
The rarefaction plot illustrates the sample size and diversity of samples in two groups: MSI and MSS.
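A rarefaction curve relates subsample size to the expected number of distinct clonotypes. The sketch below uses the standard analytical (hypergeometric) rarefaction formula; whether the authors used this formula, repeated random subsampling, or a dedicated tool is not stated here, so the method and the function name are assumptions.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of distinct clonotypes in n reads drawn without replacement.

    counts: read count per clonotype in the full sample.
    n: subsample size, between 0 and the total number of reads.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total read count")
    denom = comb(total, n)
    # For each clonotype, 1 - P(no read of that clonotype is drawn).
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)
```

Evaluating `rarefied_richness` over a range of `n` values for each sample yields one curve per sample; exact binomial coefficients are slow for very deep repertoires, so this form suits only small illustrative inputs.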

**Figure 3. Peptide-T cell receptor (TCR) and peptide-human leukocyte antigen (HLA) interactions are two complementary determinants of neoantigen immunogenicity.**
(A) The histogram displays the HLA percentile distribution of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar). (B) The percentage of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar) is compared between two groups based on HLA percentile: <2% and ≥2% (Chi-square test, p<0.00001). (C) The histogram displays the TCR ranking distribution of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar). (D) The percentage of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar) is compared between two groups based on TCR ranking: <2% and ≥2% (Chi-square test, p=0.086). (E) The scatter plot illustrates the relationship between the HLA percentile distribution and TCR ranking of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar). (F) The percentage of immunogenic antigens (red bar) and non-immunogenic peptides (gray bar) is analyzed in four distinct groups based on cutoffs of HLA percentile and TCR ranking. (G) The bar plot illustrates the sensitivity and specificity of three neoantigen prioritization approaches: based on neoantigen-HLA binding affinity alone (yellow bar), neoantigen-TCR binding ranking alone (blue bar), and the combined method using both features (red bar).
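The cutoff-based comparison in this caption can be sketched as follows. The 2% cutoffs come from the caption; combining the two features with a logical AND is an assumption (the caption does not say how the combined method merges them), and the `Peptide` type, rule names, and toy values are hypothetical.

```python
from typing import NamedTuple


class Peptide(NamedTuple):
    hla_percentile: float  # lower = stronger predicted peptide-HLA binding
    tcr_rank: float        # lower = stronger predicted peptide-TCR binding
    immunogenic: bool      # experimental label


CUTOFF = 2.0  # percent, as in the caption

RULES = {
    "hla_only": lambda p: p.hla_percentile < CUTOFF,
    "tcr_only": lambda p: p.tcr_rank < CUTOFF,
    # Assumption: "combined" means both features pass their cutoff.
    "combined_and": lambda p: p.hla_percentile < CUTOFF and p.tcr_rank < CUTOFF,
}


def sens_spec(peptides, rule):
    """Sensitivity and specificity of a boolean rule against the labels."""
    tp = fp = tn = fn = 0
    for p in peptides:
        predicted = rule(p)
        if p.immunogenic:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical toy data, not taken from the study.
toy_peptides = [
    Peptide(0.5, 1.0, True),
    Peptide(1.5, 5.0, True),
    Peptide(3.0, 0.5, False),
    Peptide(1.0, 4.0, False),
    Peptide(10.0, 20.0, False),
]
```

On the toy data, calling `sens_spec(toy_peptides, RULES[name])` for each rule shows the trade-off the figure describes: requiring both features lowers false positives at some cost in sensitivity.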

**Figure 4. The combined model demonstrates improved sensitivity and specificity for neoantigen prioritization.**
(A) The workflow for constructing the model. (B) The receiver operating characteristic (ROC) curves demonstrate the performance of both the combined model and individual models in both the discovery and validation cohorts. The bar graphs illustrate the sensitivity (C), negative predictive value (NPV) (D), and positive predictive value (PPV) (E) at specificity levels of at least 95% or 99% for both the combined and individual models in both the discovery and validation cohorts. (F) Ranking coverage scores for the specified models in either the discovery or validation cohorts.
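Reporting sensitivity, PPV, and NPV at a fixed minimum specificity, as in panels C to E, can be sketched as below. The sketch assumes each peptide has a model score where higher means more likely immunogenic; the scoring model itself, the function names, and the threshold search are illustrative assumptions rather than the authors' procedure.

```python
def _ratio(a, b):
    """Safe division that returns NaN for an empty denominator."""
    return a / b if b else float("nan")


def confusion(scores, labels, threshold):
    """Confusion counts when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    fn = sum(1 for y in labels if y) - tp
    tn = sum(1 for y in labels if not y) - fp
    return tp, fp, tn, fn


def metrics_at_specificity(scores, labels, min_specificity=0.95):
    """Metrics at the lowest threshold whose specificity meets the target."""
    if not any(labels) or all(labels):
        raise ValueError("need at least one positive and one negative label")
    # Ascending thresholds: the first one that meets the target keeps the
    # most positives, so it gives the highest sensitivity at that specificity.
    for t in sorted(set(scores)) + [float("inf")]:
        tp, fp, tn, fn = confusion(scores, labels, t)
        specificity = _ratio(tn, tn + fp)
        if specificity >= min_specificity:
            return {
                "threshold": t,
                "specificity": specificity,
                "sensitivity": _ratio(tp, tp + fn),
                "ppv": _ratio(tp, tp + fp),
                "npv": _ratio(tn, tn + fn),
            }
```

Calling the function once with `min_specificity=0.95` and once with `0.99` for each model and cohort would produce the kind of side-by-side comparison the bar graphs summarize.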

**Figure 4—figure supplement 1. Dataset construction workflow.**

**Figure 4—figure supplement 2. The performance of three machine learning models with three different algorithms is evaluated using receiver operating characteristic (ROC) curves.**
The curves depict the performance of the combined model in the discovery cohort (A) and the validation cohort (B).

**Figure 5. Validation of neoantigens identified in silico from the novel workflow through enzyme-linked immunospot (ELISpot) assays conducted on four colorectal cancer (CRC) patients.**
(A) A schematic diagram illustrates the procedural steps of neoantigen prioritization and the ELISpot assay. (B) The count of neoantigens identified from each pipeline. (C) The fold change in IFN-γ spots, relative to the wild-type peptides, is shown for 21 long peptides. Note: Only the mutants that result in a positive value in ELISpot are depicted, along with their corresponding amino acid changes and their associated rankings. (D) ELISpot assays on six long peptides resulting in at least a twofold change in IFN-γ spots. (E) The bar graphs display the ranking of validated long peptides identified from the NetMHCpan tool (blue bar) or the combined method (red bar) for individual patients and all patients.
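The fold-change readout in panels C and D can be illustrated with a short sketch. The twofold cutoff is taken from the caption; the handling of zero wild-type counts, the function names, and the example peptide labels are assumptions, and the assay's actual normalization is not described here.

```python
def fold_change(mutant_spots, wildtype_spots):
    """IFN-γ spot count for the mutant peptide relative to wild type."""
    if mutant_spots < 0 or wildtype_spots < 0:
        raise ValueError("spot counts cannot be negative")
    if wildtype_spots == 0:
        # Assumption: an undefined ratio is reported as inf (or NaN for 0/0).
        return float("inf") if mutant_spots > 0 else float("nan")
    return mutant_spots / wildtype_spots


def responders(spot_counts, min_fold=2.0):
    """Peptides reaching min_fold, sorted from strongest to weakest.

    spot_counts: dict mapping peptide name -> (mutant_spots, wildtype_spots).
    """
    folds = {name: fold_change(m, w) for name, (m, w) in spot_counts.items()}
    hits = [name for name, fc in folds.items() if fc >= min_fold]
    return sorted(hits, key=lambda name: folds[name], reverse=True)
```

For example, `responders({"pepA": (40, 10), "pepB": (15, 10)})` keeps only `pepA`, whose mutant response is four times the wild-type response.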

**Figure 5—figure supplement 1. The rank coverage score of the combined model compared to NetMHCpan.**
The bar graphs display rank coverage scores of validated long peptides identified by the NetMHCpan tool (blue bars) and the combined method (red bars) for individual patients and all patients collectively.

Grants and funding

NC01/NextCalibur Therapeutic
