Transl Lung Cancer Res. 2024 Apr 29;13(4):763-784.
doi: 10.21037/tlcr-23-800. Epub 2024 Apr 25.

Molecular and immune characterization of Chinese early-stage non-squamous non-small cell lung cancer: a multi-omics cohort study

Haoxin Peng et al. Transl Lung Cancer Res. 2024.

Abstract

Background: Although early-stage non-squamous non-small cell lung cancer (Ns-NSCLC) is considered to have superior survival, around 30% of patients relapse within 5 years, suggesting a distinct underlying biology. However, the biological characteristics of early-stage Ns-NSCLC, especially in the Chinese population, remain unclear.

Methods: Multi-omics interrogation of early-stage Ns-NSCLC (stage I-III) tumors, paired blood samples, and normal lung tissues (n=76) was conducted by whole-exome sequencing (WES), RNA sequencing, and T-cell receptor (TCR) sequencing.

Results: An average of 128 exonic mutations was identified, and the most frequently mutated gene was EGFR (55%), followed by TP53 (37%) and TTN (26%). Mutations in MUC17, ABCA2, PDE4DIP, and MYO18B predicted significantly worse disease-free survival (DFS). Moreover, cytoband amplifications at 8q24.3, 14q13.1, and 14q11.2 and a deletion at 3p21.1 were highlighted in recurrent cases. A higher incidence of human leukocyte antigen loss of heterozygosity (HLA-LOH), higher tumor mutational burden (TMB), and higher tumor neoantigen burden (TNB) were identified in ever-smokers than in never-smokers. HLA-LOH also correlated with higher TMB, TNB, intratumoral heterogeneity (ITH), and whole chromosomal instability (wCIN) scores. Interestingly, higher ITH was an independent predictor of better DFS in early-stage Ns-NSCLC. Up-regulation of immune-related genes, including CRABP2, ULBP2, IL31RA, and IL1A, independently portended a dismal prognosis. Enhanced TCR diversity of peripheral blood mononuclear cells (PBMCs) predicted a better prognosis, suggesting a noninvasive method for relapse surveillance. Finally, seven machine-learning (ML) algorithms were employed to evaluate the predictive accuracy of clinical, genomic, transcriptomic, and TCR repertoire data for DFS. The combination of clinical and RNA features in the random forest (RF) algorithm, with an area under the curve (AUC) of 97.5% and 83.3% in the training and testing cohorts, respectively, significantly outperformed the other methods.

Conclusions: This study comprehensively profiled the genomic, transcriptomic, and TCR repertoire spectrums of Chinese early-stage Ns-NSCLC, shedding light on its biological underpinnings and on candidate prognostic biomarkers.

Keywords: Early-stage lung cancer (early-stage LC); disease-free survival (DFS); machine-learning (ML); multi-omics.

Conflict of interest statement

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-23-800/coif). W.L. serves as an unpaid Associate Editor-in-Chief of Translational Lung Cancer Research from May 2023 to April 2024. Xiaoli Cui, D.W., Z.G., and H.L. are current employees of YuceBio Technology Co., Ltd. The other authors have no conflicts of interest to declare.

Figures

Figure 1
Mutation landscape of Chinese early-stage Ns-NSCLC in the GZMU1H cohort. Paired sample information for multi-omics interrogation (A). Mutation profiles of Chinese early-stage Ns-NSCLC, with each column representing an individual patient. The upper barplot shows mutational load, and the right barplot shows the mutation frequency of individual genes (B). Summary of genetic variants, including numbers, classifications, and types (C). Somatic mutational burden differences by driver gene status, including TP53 (D), KRAS (E), ALK (F), and EGFR (G). DFS differences between different levels of somatic mutational burden (H). Single-gene mutations, including MUC17 (I), ABCA2 (J), PDE4DIP (K), and MYO18B (L), showed prognostic significance for DFS. Somatic copy number alteration differences of patients without (M) or with (N) relapse. Single-base substitution signature differences by smoking status (O) and relapse status (P). Continuous data were compared by Wilcoxon test; **, P<0.01. Data are presented as mean ± standard error for (D-G) and as box and whisker plots for (C,P). The horizontal bar inside the boxes represents the median, and the lower and upper ends of the boxes are the first and third quartiles. The whiskers indicate values within 1.5× the inter-quartile range from the upper or lower quartile. The dots represent the values of individual samples. The meanings of the different colors are presented in (C; bottom left), and the colors used are consistent among the top left, bottom left, bottom middle, and bottom right parts of (C).
DFS, disease-free survival; cTNM, clinical tumor staging; WES, whole-exome sequencing; TCR, T-cell receptor; PBMC, peripheral blood mononuclear cells; TMB, tumor mutational burden; DEL, deletion; TNP, tri-nucleotide polymorphism; SNP, single nucleotide polymorphism; ONP, oligo-nucleotide polymorphism; INS, insertion; DNP, di-nucleotide polymorphism; SNV, single nucleotide variation; SBS, single-base substitution; Ns-NSCLC, non-squamous non-small cell lung cancer; GZMU1H, the First Affiliated Hospital of Guangzhou Medical University.
Figure 2
Genomic biomarker spectrums of Chinese early-stage non-squamous NSCLC. Genomic biomarker landscapes, including TMB, TNB, wCIN, ITH, and HLA-LOH of each patient, with relapse status as an annotation (A). Genomic biomarker spectrum differences by smoking status (B), driver gene mutations (C,D,T), and HLA-LOH status (G). TMB and TNB (E) and TMB and wCIN (F) were strongly correlated. Higher mutation frequencies of TP53 (H), TTN (I), and CSMD1 (J) were observed in patients with HLA-LOH. A higher wCIN score was found in patients with relapse than in those without (K). Prognostic significance of TMB (L), wCIN (M), and the TMB & ITH combination (N) as evaluated by log-rank test. Prognostic effects of genomic biomarkers as evaluated by multivariate Cox regression analysis (O). Mutation frequency of EGFR in stage I patients with or without relapse (P). Disease-free survival differences by EGFR mutation status (Q) and mutation subtypes (R). Prognostic effects of the EGFR and TMB combination category (S). Co-occurring variants of PEG3 (U) and STK11 (V) in patients with or without EGFR mutation. The horizontal bar inside the boxes represents the median, and the lower and upper ends of the boxes are the first and third quartiles. Continuous data were compared by Kruskal-Wallis test; *, P<0.05; **, P<0.01; ***, P<0.001; ns, non-significant. HLA-LOH, human leukocyte antigen loss of heterozygosity; wCIN, whole chromosomal instability; ITH, intratumoral heterogeneity; TMB, tumor mutational burden; TNB, tumor neoantigen burden; MSI, microsatellite instability; DFS, disease-free survival; HR, hazard ratio; CI, confidence interval; wEGFR, wild-type EGFR; mEGFR, mutated EGFR; NSCLC, non-small cell lung cancer.
Figure 3
Transcriptomic spectrums of early-stage non-squamous non-small cell lung cancer. Differentially expressed genes and corresponding enriched pathways of ever-smokers vs. never-smokers (A,B), recurrent vs. disease-free (C,D), stage II–III vs. stage I (E,F), EGFR-mutated vs. wild-type (G,H). The LASSO Cox regression model screened out robust prognosticators among immune-related genes (I,J), and their prognostic effects were evaluated by the univariate Cox regression analysis (K). Red and green dots refer to significantly up-regulated and down-regulated genes, respectively. Black dots represent genes with insignificant changes in expression levels. NK, natural killer; HR, hazard ratio; CI, confidence interval; LASSO, least absolute shrinkage and selection operator.
Figure 4
Immune infiltration landscapes in the tumor nest of early-stage Ns-NSCLC. Immune infiltration differences between different tumor mutational burden levels (A), tumor neoantigen burden levels (B), human leukocyte antigen loss of heterozygosity status (C), EGFR mutation status (D), disease-free survival status (E), and cTNM stage level (F), as evaluated by the CIBERSORT algorithm. The K-means clustering method identified two major immune subtypes of early-stage Ns-NSCLC (G). Comparison of continuous data by Kruskal-Wallis test. *, P<0.05; **, P<0.01; ns, non-significant. NK, natural killer; TMB, tumor mutational burden; TNB, tumor neoantigen burden; HLA-LOH, human leukocyte antigen loss of heterozygosity; Ns-NSCLC, non-squamous non-small cell lung cancer; WT, wild-type; Mut, mutant; DFS, disease-free survival; cTNM, clinical tumor staging.
Figure 5
TCR repertoire spectrums of early-stage Ns-NSCLC. TCR repertoire diversity differences among peripheral blood, tumor, and paratumor samples (A,B). TCR repertoire diversity differences concerning cTNM stage (C), TP53 (D), EGFR (E), and ALK (F) mutation status and smoking status (G). The prognostic effects of TCR repertoire diversity, including d50Index (H), normalizedShannonWienerIndex (I), and inverseSimpsonIndex (J). Comparison of continuous data by Kruskal-Wallis test, *, P<0.05; ***, P<0.001; ns, non-significant. PBMC, peripheral blood mononuclear cells; cTNM, clinical tumor staging; TCR, T-cell receptor; Ns-NSCLC, non-squamous non-small cell lung cancer.
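The three diversity measures named in the Figure 5 caption can be illustrated with a minimal Python sketch. The abstract does not state how the authors computed them, so the formulas below are common textbook definitions chosen as assumptions: normalized Shannon-Wiener as entropy divided by the log of the number of unique clonotypes, inverse Simpson as the reciprocal of the sum of squared clonotype frequencies, and D50 as the fraction of unique clonotypes (largest first) needed to cover half of all reads. The function names and the toy read counts are hypothetical.

```python
import math


def clonotype_frequencies(counts):
    """Convert per-clonotype read counts into relative frequencies."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one clonotype must have a positive count")
    return [c / total for c in counts if c > 0]


def normalized_shannon(counts):
    """Shannon entropy scaled to [0, 1] by ln(number of unique clonotypes)."""
    freqs = clonotype_frequencies(counts)
    if len(freqs) < 2:
        return 0.0  # a single clonotype has no diversity
    entropy = -sum(p * math.log(p) for p in freqs)
    return entropy / math.log(len(freqs))


def inverse_simpson(counts):
    """Reciprocal of the probability that two random reads share a clonotype."""
    freqs = clonotype_frequencies(counts)
    return 1.0 / sum(p * p for p in freqs)


def d50(counts):
    """Fraction of unique clonotypes (largest first) covering >= 50% of reads.

    Assumed definition; other tools report this as a count or a percentage.
    """
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("at least one clonotype must have a positive count")
    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        if 2 * running >= total:
            return rank / len(ordered)


# Hypothetical repertoires: one perfectly even, one dominated by a single clone.
even_repertoire = [10, 10, 10, 10]
skewed_repertoire = [97, 1, 1, 1]
```

Under these assumed definitions, the even repertoire scores higher on all three measures than the skewed one, which matches the usual reading that a clonally expanded repertoire is less diverse.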
Figure 6
Construction and validation of a multi-omics prognostic model based on clinicopathological, genomic, transcriptomic, and T-cell receptor repertoire sequencing data. The predictive accuracy of individual omics layers and biomarkers, including clinicopathological characteristics (A) and genomic biomarkers (B-H). Flowchart demonstrating the process of establishing a multi-omics prognostic model via machine-learning approaches (I). Predictive accuracy of the multi-omics prognostic model based on different combination categories in the training and testing cohorts (J). Feature importance analyses as evaluated by the Gini index of the RF model combining four omics categories (K). DFS differences predicted by the RF algorithm in the training (L) and testing (M) cohorts. TPR, true positive rate; FPR, false positive rate; cTNM, clinical tumor staging; wCIN, whole chromosomal instability; TMB, tumor mutational burden; CNH, copy number high; MSI, microsatellite instability; ITH, intratumoral heterogeneity; LOH, loss of heterozygosity; AUC, area under the curve; CI, confidence interval; TCR, T-cell receptor; RF, random forest; DFS, disease-free survival.
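The AUC used to compare models in Figure 6 can be read as the probability that a randomly chosen relapsed patient receives a higher predicted risk than a randomly chosen relapse-free patient. The sketch below computes it directly from that pairwise definition in plain Python. It is an illustration only, not the authors' pipeline: the function name, labels, and scores are hypothetical, and the study's own models and software are not described in the abstract.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    labels: 1 for an event (for example relapse), 0 for no event.
    scores: predicted risk, higher meaning more likely to have the event.
    Tied scores between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy cohort: two relapsed and two relapse-free patients.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

In this toy cohort three of the four relapsed/relapse-free pairs are ranked correctly, so the AUC is 0.75. The pairwise loop is quadratic in cohort size, which is acceptable for small cohorts; a rank-based formula would be preferable for large ones.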
