EBioMedicine. 2025 Apr:114:105663.
doi: 10.1016/j.ebiom.2025.105663. Epub 2025 Mar 22.

Deep learning informed multimodal fusion of radiology and pathology to predict outcomes in HPV-associated oropharyngeal squamous cell carcinoma

Bolin Song et al. EBioMedicine. 2025 Apr.

Abstract

Background: We aim to predict outcomes of human papillomavirus (HPV)-associated oropharyngeal squamous cell carcinoma (OPSCC), a subtype of head and neck cancer characterized by improved clinical outcomes and better response to therapy. Pathology- and radiology-focused AI-based prognostic models have been developed independently for OPSCC, but their integration, incorporating both the primary tumour (PT) and metastatic cervical lymph nodes (LN), remains unexamined.

Methods: We investigate the prognostic value of an AI approach termed the swintransformer-based multimodal and multi-region data fusion framework (SMuRF). SMuRF integrates CT features from the PT and LN with whole-slide pathology images from the PT to predict survival and tumour grade in HPV-associated OPSCC. SMuRF employs cross-modality and cross-region window-based multi-head self-attention mechanisms to capture interactions between features across tumour habitats and image scales.
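As a rough illustration of the attention step described in Methods, the sketch below applies single-head scaled dot-product self-attention to one window of tokens drawn from the three inputs (CT of the PT, CT of the LN, pathology of the PT). It is not the SMuRF implementation: learned query/key/value projections, multiple heads and window shifting are all omitted, and the feature vectors and names (`ct_pt`, `ct_ln`, `wsi_pt`, `self_attention`) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplifying assumption: queries, keys and values are the tokens
    themselves (identity projections), so no weights are learned.
    Returns the attended tokens and the attention weight matrix.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * tokens[j][c] for j in range(len(tokens)))
             for c in range(dim)]
        )
    return outputs, weights


# Hypothetical 4-dimensional feature tokens (made-up values).
ct_pt = [[1.0, 0.0, 0.5, 0.2]]                          # CT, primary tumour
ct_ln = [[0.8, 0.1, 0.4, 0.3]]                          # CT, lymph node
wsi_pt = [[0.0, 1.0, 0.2, 0.9], [0.1, 0.9, 0.3, 0.8]]   # pathology patches

# One attention window mixing modalities and regions.
window = ct_pt + ct_ln + wsi_pt
fused, attn = self_attention(window)
```

Each row of `attn` shows how strongly one token attends to every other token in the window, including tokens from a different modality or region, which is the kind of interaction the abstract refers to.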

Findings: Developed and tested on a cohort of 277 patients with OPSCC with matched radiology and pathology images, SMuRF demonstrated strong performance (C-index = 0.81 for DFS prediction and AUC = 0.75 for tumour grade classification) and emerged as an independent prognostic biomarker for DFS (hazard ratio [HR] = 17, 95% confidence interval [CI], 4.9-58, p < 0.0001) and tumour grade (odds ratio [OR] = 3.7, 95% CI, 1.4-10.5, p = 0.01) controlling for other clinical variables (i.e., T-, N-stage, age, smoking, sex and treatment modalities). Importantly, SMuRF outperformed unimodal models derived from radiology or pathology alone.
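For readers unfamiliar with the C-index reported above, the following minimal sketch computes Harrell's concordance index on a toy example. The function name `concordance_index` and all input values are made up for illustration and are unrelated to the study data; pairs with tied event times are simply skipped in this simplified version.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the patient with the earlier event has the higher risk score;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (made-up values): follow-up time, event flag, model risk score.
times = [5, 10, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.5, 0.1]

c_index = concordance_index(times, events, risks)
print(round(c_index, 2))  # 6 concordant of 8 comparable pairs -> 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.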

Interpretation: Our findings underscore the potential of multimodal deep learning in accurately stratifying OPSCC risk, informing tailored treatment strategies and potentially refining existing treatment algorithms.

Funding: The National Institutes of Health, the U.S. Department of Veterans Affairs and National Institute of Biomedical Imaging and Bioengineering.

Keywords: Deep learning; Multimodal biomarker; Oropharyngeal cancer; Pathology; Radiology.

Conflict of interest statement

Declaration of interests: Dr. Madabhushi is an equity holder in Picture Health, Elucid Bioimaging, and Inspirata Inc. He currently serves on the advisory boards of Picture Health and SimBioSys, and consults for Takeda Inc. He also has sponsored research agreements with AstraZeneca and Bristol Myers-Squibb. His technology has been licenced to Picture Health and Elucid Bioimaging. He is also involved in two R01 grants with Inspirata Inc. and serves as a member of the Frederick National Laboratory Advisory Committee. Dr. Kailin Yang was supported by an RSNA Research Fellow Grant and an ASTRO-LUNGevity Foundation Radiation Oncology Seed Grant.

Figures

Fig. 1
Flowchart of this study: a) multimodal data curation and annotation; b) preprocessing: on pathology WSI, fragment contours (green) are generated using the CLAM toolbox and tumour annotations (red) are provided by pathologists; on radiology CT, primary tumour (yellow) and metastatic cervical lymph node (blue) annotations are provided by radiologists; c) multi-region and multiscale fusion with SwinT; d) model inference: survival and grade predictions. Red regions on CT and WSI indicate the prognostically relevant regions that the model focuses on. W-MSA: window-based multi-head self-attention; SW-MSA: shifted window multi-head self-attention; SwinT: swin-transformer; HIPT: Hierarchical Image Pyramid Transformer.
Fig. 2
Kaplan–Meier survival analysis of SMuRF for OPSCC DFS stratification on the training set (a), validation set (b) and test set (c), and for grade classification on the training set (d), validation set (e) and test set (f). Seven models are compared on the test set (g), accounting for input modalities and cancer habitat regions. Four fusion schemes are compared on the test set (h).
Fig. 3
Multivariable Cox regression analysis using DFS as endpoint and multivariable logistic regression using grade as endpoint on the test set (a). Beeswarm plots of SHAP variable importance in the multivariable Cox regression analysis (b) and the multivariable logistic regression analysis (d). Mean SHAP values were converted to proportions for each variable, quantifying their contributions to the DFS predictions (c) and the grade classification (e).
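The conversion of mean SHAP values to proportions mentioned in this caption can be sketched as follows. This assumes the mean is taken over absolute per-patient SHAP values, which is a common convention but is not confirmed by the caption; the function name `shap_proportions` and all numbers are made up for illustration.

```python
def shap_proportions(shap_values):
    """Convert per-patient SHAP values into per-variable proportions.

    shap_values maps a variable name to a list of SHAP values, one per
    patient. Assumption: importance is the mean absolute SHAP value,
    normalised so that all variables sum to 1.
    """
    mean_abs = {
        name: sum(abs(v) for v in values) / len(values)
        for name, values in shap_values.items()
    }
    total = sum(mean_abs.values())
    if total == 0:
        raise ValueError("all SHAP values are zero")
    return {name: value / total for name, value in mean_abs.items()}


# Made-up SHAP values for three patients (not taken from the study).
toy_shap = {
    "SMuRF": [0.6, -0.4, 0.5],
    "N-stage": [0.1, -0.2, 0.0],
    "age": [0.05, 0.05, -0.05],
}

proportions = shap_proportions(toy_shap)
for name, share in proportions.items():
    print(f"{name}: {share:.1%}")
```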
Fig. 4
SMuRF histogram distributions on the validation and test sets (a), and four representative patient examples with clinical information (b) and cropped CT scans with primary tumour (c) and metastatic cervical lymph node (e) annotations. The corresponding integrated gradient (IG) attention maps (d, f) highlight the regions important for prediction. The IG results show that the deep learning model focused on regions within the primary tumour and metastatic cervical lymph nodes.
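Integrated gradients attribute a prediction to input features by averaging the model's gradient along a straight path from a baseline to the input and scaling by the input difference. The sketch below shows the idea on a toy logistic model with an analytic gradient; the model, its weights, the input and the all-zero baseline are hypothetical and unrelated to the network used in the study.

```python
import math

WEIGHTS = (2.0, -1.0, 0.5)  # hypothetical toy model weights


def model(x):
    """Toy logistic model standing in for a deep network."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad_sum = [s + g for s, g in zip(grad_sum, gradient(point))]
    # Scale the average gradient by the distance from the baseline.
    return [(xi - b) * s / steps for xi, b, s in zip(x, baseline, grad_sum)]


x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to model(x) - model(baseline).
gap = sum(attributions) - (model(x) - model(baseline))
```

For image inputs the same computation is applied per pixel or voxel, and the resulting attributions are what an IG heatmap displays.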
Fig. 5
One example high-SMuRF (a) and one example low-SMuRF (b) pathology WSI with primary tumour annotations (red boundaries) and IG overlaid to highlight the prognostically relevant regions (bottom left) detected by SMuRF. For each pathology WSI, two example 2048 × 2048 regions with corresponding attention heatmaps (c, d) within the highlighted prognostically relevant regions are provided. The attention heatmaps of the HIPT model at a patch size of 256 (macro-scale) differ between the high- and low-SMuRF patients: the high-SMuRF WSI shows more condensed high-attention areas related to the tumour-collagen fibre interface (c), whereas the low-SMuRF WSI contains primarily tumour cell clusters (d). The attention heatmaps at a patch size of 16 (micro-scale) highlight individual collagen fibres (e) for the high-SMuRF WSI, while emphasizing mainly the tumour cells (f) for the low-SMuRF WSI. The presence of individual morphologic hallmarks (i.e., tumour-collagen fibre interface, tumour cell clusters) was evaluated and confirmed by a pathologist (T.P.).
