Cell Rep Med. 2025 Dec 16;6(12):102479.
doi: 10.1016/j.xcrm.2025.102479. Epub 2025 Dec 8.

A multimodal synergistic model for personalized neoadjuvant immunochemotherapy in esophageal cancer


Zihan Zhao et al. Cell Rep Med.

Abstract

Neoadjuvant immunochemotherapy (nICT) has significantly improved the treatment of locally advanced esophageal cancer (EC), yet accurately identifying patients' response remains a major challenge. In this study, we introduce eSPARK, a multimodal framework designed to integrate routinely available clinical data for informed decision-making in nICT treatment for EC. The model is developed using 344 patients from three independent regions, each with pre-treatment-paired computed tomography (CT) imaging and pathological slides, and postoperative pathological complete response (pCR) outcomes. By incorporating cytological semantic information, eSPARK demonstrates superior generalizability, outperforming single-modality models and achieving robust predictive accuracy across multicenter datasets. Additionally, a multi-scale interpretability module identifies several biomarkers, including the neutrophil-to-lymphocyte ratio (NLR) in the tumor microenvironment, associated with nICT response. Our findings underscore the potential of eSPARK as a powerful tool for personalized therapeutic decision-making in locally advanced EC and its broader implications for advancing precision oncology through multidisciplinary data integration.

Keywords: deep learning; esophageal cancer; multimodal; neoadjuvant immunochemotherapy.


Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Study cohort characteristics and workflow (A) Overview and utilization of the datasets. This study analyzed 344 locally advanced ESCC patients from three distinct regions, with each case accompanied by comprehensive and paired clinical data. The model was trained exclusively on the internal cohort and validated for generalizability across three multicenter cohorts. (B) Heatmap of patient clinical features. (C) Structural features of the eSPARK framework. The framework integrates multi-scale radiological information with cell-informed histological data through multimodal fusion. (D) Workflow of eSPARK in clinical settings. The model integrates routine pre-treatment assessment data to predict therapeutic benefits. Through an interpretable module, it provides clinicians with real-time multi-scale visual reports, potential biomarkers, and multimodal treatment recommendations to support final clinical decision-making. (E) Overview of multi-scale medical reports and AI-assisted biomarker discovery.
Figure 2
Performance of CytoPath (A) Architecture of CytoPath. The semantic information of the primary cellular components within the TIME of ESCC was integrated with morphological features through the text encoder. (B) The impact of the text-assisted module on the performance of various model architectures. (C) Patch clustering maps matched to major cell types in External-HN (left) and External-ST (right). (D) Receiver operating characteristic (ROC) curves of CytoPath in the external validation cohorts. (E) Comparison of predicted value distributions between responder and non-responder groups in External-HN (top) and External-ST (bottom). Boxplots depict the 25th, 50th, and 75th percentiles.
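The ROC curves in panel (D) are usually summarized by the area under the curve (AUC). As a minimal sketch, and not the authors' evaluation code, the AUC can be computed as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder, with ties counted as one half. The function name `roc_auc` and the example labels and scores below are hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: fraction of (responder, non-responder) pairs ranked correctly.

    labels: 1 for responder (pCR), 0 for non-responder.
    scores: predicted probability of response, same order as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one responder and one non-responder")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # responder ranked above non-responder
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(pos) * len(neg))


# Hypothetical example (made-up numbers, not study data).
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a sort-based version would be preferable for larger datasets.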
Figure 3
Performance of the MScaleCT model (A) Automatic segmentation results of entire esophageal and local ESCC regions in CT images. (B) Univariate logistic regression analysis of the significance of radiomic features from different scale imaging groups in the internal training cohort. (C) Venn diagram of significant features across different scales in the internal training cohort. (D) Structure of MScaleCT, utilizing multi-scale medical imaging for treatment response prediction. (E) Comparison of MScaleCT with single-scale deep learning models and logistic regression models across different datasets. Error bars represent 95% confidence intervals.
Figure 4
Predictive performance of eSPARK (A) This study employs 10-fold cross-validation, integrating the predictions of 10 models using a voting mechanism to obtain the final prediction. (B) ROC curves of eSPARK predictions in External-HN (top) and External-ST (bottom). (C) Distribution of model predictions and comparison of predicted value distributions between responder and non-responder groups in External-HN (left) and External-ST (right). (D) Comparison of AUC between eSPARK and single-modality models in External-HN (top) and External-ST (bottom). (E) Correction of predictions by eSPARK compared to MScaleCT in External-HN (left) and External-ST (right). Boxplots depict the 25th, 50th, and 75th percentiles. MCT, MScaleCT.
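Panel (A) describes combining the 10 cross-validation models through a voting mechanism, but the caption does not state the voting rule. The sketch below assumes simple hard majority voting at a 0.5 probability threshold, with a tie resolved as non-response; both choices, the function name `majority_vote`, and the example probabilities are assumptions for illustration only.

```python
from typing import Sequence


def majority_vote(fold_probs: Sequence[float], threshold: float = 0.5) -> int:
    """Combine per-fold predicted probabilities into one binary prediction.

    Each fold model votes 1 (predicted pCR) if its probability is at or above
    the threshold. The ensemble returns 1 only when a strict majority votes 1;
    a tie is resolved as 0 (an assumption, not a rule stated in the source).
    """
    if not fold_probs:
        raise ValueError("need at least one fold prediction")
    votes = sum(1 for p in fold_probs if p >= threshold)
    return 1 if votes * 2 > len(fold_probs) else 0


# Hypothetical outputs of 10 fold models for two patients (made-up numbers).
patient_a = [0.81, 0.64, 0.72, 0.55, 0.49, 0.90, 0.61, 0.58, 0.77, 0.43]
patient_b = [0.12, 0.51, 0.33, 0.47, 0.29, 0.62, 0.18, 0.40, 0.55, 0.21]

prediction_a = majority_vote(patient_a)  # 8 of 10 models vote for response
prediction_b = majority_vote(patient_b)  # 3 of 10 models vote for response
```

An alternative would be soft voting, which averages the fold probabilities before thresholding; the caption does not say which variant was used.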
Figure 5
Multi-scale interpretability of CT and pathology modalities (A) Contribution analysis of CT imaging features from ESCC (top) and esophageal (bottom) scales. (B) Multi-scale comparison of CT images between a responder and a non-responder. (C) Attention visualization heatmap at WSI level and high-focus regions. (D) Feature clustering at patch level and automated report generation results.
Figure 6
Discovery of biomarkers at the cellular level (A) Comparison of the contribution of various major cell types in Internal-CC (left), External-HN (middle), and External-ST (right). (B) Comparison of different immune cell types (left) and tumor cell types (right) in Internal-CC, External-HN, and External-ST. Low NLR and high differentiation consistently predict better treatment responses across all datasets. (C) Ablation experiment of different cell types in Internal-CC (left), External-HN (middle), and External-ST (right). Lym, lymphocyte; Neo, neutrophil; PD, poorly differentiated tumor cell; WD, well-differentiated tumor cell.
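The neutrophil-to-lymphocyte ratio (NLR) referenced in panel (B) is the number of neutrophils divided by the number of lymphocytes. The sketch below assumes that per-slide cell counts are already available from some cell classification step; how eSPARK derives these quantities is not described here, and the function name and counts are hypothetical.

```python
def neutrophil_to_lymphocyte_ratio(neutrophils: int, lymphocytes: int) -> float:
    """Return NLR = neutrophil count / lymphocyte count.

    Raises ValueError for negative counts or when no lymphocytes were counted,
    since the ratio is undefined in that case.
    """
    if neutrophils < 0 or lymphocytes < 0:
        raise ValueError("cell counts must be non-negative")
    if lymphocytes == 0:
        raise ValueError("NLR is undefined when the lymphocyte count is zero")
    return neutrophils / lymphocytes


# Hypothetical counts for one slide (made-up numbers, not study data).
example_nlr = neutrophil_to_lymphocyte_ratio(neutrophils=120, lymphocytes=480)
```

Under the finding reported in the caption, a lower value of this ratio would be associated with a better treatment response; no cutoff is given in the source, so none is assumed here.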
