Deep learning informed multimodal fusion of radiology and pathology to predict outcomes in HPV-associated oropharyngeal squamous cell carcinoma
- PMID: 40121941
- PMCID: PMC11979917
- DOI: 10.1016/j.ebiom.2025.105663
Abstract
Background: We aim to predict outcomes of human papillomavirus (HPV)-associated oropharyngeal squamous cell carcinoma (OPSCC), a subtype of head and neck cancer characterized by improved clinical outcomes and better response to therapy. Pathology- and radiology-focused AI-based prognostic models have been developed independently for OPSCC, but their integration, incorporating both the primary tumour (PT) and metastatic cervical lymph nodes (LN), remains unexamined.
Methods: We investigate the prognostic value of an AI approach termed the Swin Transformer-based multimodal and multi-region data fusion framework (SMuRF). SMuRF integrates features from CT corresponding to the PT and LN, together with whole-slide pathology images from the PT, to predict survival and tumour grade in HPV-associated OPSCC. SMuRF employs cross-modality and cross-region window-based multi-head self-attention mechanisms to capture interactions between features across tumour habitats and image scales.
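The cross-modality attention described above can be illustrated with a minimal NumPy sketch, in which tokens from one modality or tumour region (queries) attend to tokens from another (keys/values). All shapes, token counts, and the random projection weights are illustrative assumptions, not the authors' implementation (the actual SMuRF model uses learned Swin Transformer parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(query_feats, context_feats, num_heads=4, rng=None):
    """Hypothetical cross-modality attention: query tokens from one
    modality/region attend to context tokens from another.
    query_feats: (Nq, d), context_feats: (Nc, d); returns (Nq, d)."""
    rng = np.random.default_rng(0) if rng is None else rng
    Nq, d = query_feats.shape
    Nc, _ = context_feats.shape
    assert d % num_heads == 0
    dh = d // num_heads
    # Random projections stand in for learned Q/K/V weight matrices.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q = (query_feats @ Wq).reshape(Nq, num_heads, dh)
    K = (context_feats @ Wk).reshape(Nc, num_heads, dh)
    V = (context_feats @ Wv).reshape(Nc, num_heads, dh)
    out = np.empty_like(Q)
    for h in range(num_heads):
        # Scaled dot-product attention per head.
        attn = softmax(Q[:, h] @ K[:, h].T / np.sqrt(dh), axis=-1)
        out[:, h] = attn @ V[:, h]
    return out.reshape(Nq, d)

# Toy fusion: primary-tumour CT tokens attend jointly to lymph-node CT
# tokens and pathology tokens (token counts and dimension are made up).
pt_ct = np.random.default_rng(1).standard_normal((8, 32))   # PT radiology tokens
ln_ct = np.random.default_rng(2).standard_normal((6, 32))   # LN radiology tokens
path  = np.random.default_rng(3).standard_normal((10, 32))  # pathology tokens
fused = multi_head_cross_attention(pt_ct, np.vstack([ln_ct, path]))
print(fused.shape)  # (8, 32)
```

In this sketch the fused PT representation is a weighted mixture of LN and pathology tokens, which is the basic mechanism by which attention lets features from different tumour habitats and modalities interact.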
Findings: Developed and tested on a cohort of 277 patients with OPSCC with matched radiology and pathology images, SMuRF demonstrated strong performance (C-index = 0.81 for disease-free survival [DFS] prediction and AUC = 0.75 for tumour grade classification) and emerged as an independent prognostic biomarker for DFS (hazard ratio [HR] = 17, 95% confidence interval [CI] 4.9-58, p < 0.0001) and tumour grade (odds ratio [OR] = 3.7, 95% CI 1.4-10.5, p = 0.01) after controlling for other clinical variables (T-stage, N-stage, age, smoking, sex and treatment modality). Importantly, SMuRF outperformed unimodal models derived from radiology or pathology alone.
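The C-index reported above (Harrell's concordance index) measures how often a model's predicted risk correctly orders pairs of patients by observed survival; 0.5 is chance and 1.0 is perfect ranking. A minimal sketch of the standard calculation, on made-up toy data (not the study's cohort):

```python
import numpy as np

def concordance_index(times, events, risk_scores):
    """Harrell's C-index: over all comparable pairs (the patient with the
    earlier time must have an observed event), the fraction in which the
    shorter-surviving patient was assigned the higher risk. Risk ties
    count as 0.5."""
    concordant, permissible = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is comparable only if patient i's event was observed
            # (not censored) and occurred before patient j's time.
            if events[i] == 1 and times[i] < times[j]:
                permissible += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / permissible

# Toy data: 4 patients; events[i] == 0 marks a censored observation.
times  = np.array([5, 8, 12, 20])
events = np.array([1, 1, 0, 1])
risk   = np.array([0.9, 0.7, 0.2, 0.1])  # higher = worse predicted outcome
print(round(concordance_index(times, events, risk), 2))  # 1.0
```

Here every comparable pair is ordered correctly, so the C-index is 1.0; a value of 0.81, as reported for SMuRF, means roughly four out of five comparable patient pairs are ranked correctly.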
Interpretation: Our findings underscore the potential of multimodal deep learning in accurately stratifying OPSCC risk, informing tailored treatment strategies and potentially refining existing treatment algorithms.
Funding: The National Institutes of Health, the U.S. Department of Veterans Affairs and National Institute of Biomedical Imaging and Bioengineering.
Keywords: Deep learning; Multimodal biomarker; Oropharyngeal cancer; Pathology; Radiology.
Published by Elsevier B.V.
Conflict of interest statement
Declaration of interests Dr. Madabhushi is an equity holder in Picture Health, Elucid Bioimaging, and Inspirata Inc. He currently serves on the advisory boards of Picture Health and SimBioSys, and consults for Takeda Inc. He also has sponsored research agreements with AstraZeneca and Bristol Myers Squibb. His technology has been licensed to Picture Health and Elucid Bioimaging. He is also involved in two different R01 grants with Inspirata Inc., and serves as a member of the Frederick National Laboratory Advisory Committee. Dr. Kailin Yang was supported by an RSNA Research Fellow Grant and an ASTRO-LUNGevity Foundation Radiation Oncology Seed Grant.