Nat Commun. 2025 Aug 4;16(1):7163.
doi: 10.1038/s41467-025-62386-6.

A multimodal dataset for precision oncology in head and neck cancer

Marion Dörrich et al. Nat Commun.

Abstract

Head and neck cancer is a common disease and is associated with a poor prognosis. A promising approach to improving patient outcomes is personalized treatment, which uses information from a variety of modalities. However, little progress has been made because large public datasets are lacking. We present a multimodal dataset, HANCOCK, that comprises monocentric, real-world data from 763 head and neck cancer patients. Our dataset contains demographic, pathological, and blood data as well as surgery reports and histologic images, which can be explored in a low-dimensional representation. We show that combining these modalities using machine learning is superior to using a single modality and that integrating imaging data using foundation models helps in endpoint prediction. We believe that HANCOCK will not only open new insights into head and neck cancer pathology but also serve as a major source for researching multimodal machine-learning methodologies in precision oncology.

Conflict of interest statement

Competing interests: A.H. declares general disclosures (honoraria for lectures or consulting/advisory boards for AbbVie, AstraZeneca, Biocartis, BMS, Boehringer Ingelheim, Cepheid, Diaceutics, Gilead, Illumina, Ipsen, Janssen, Lilly, Merck, MSD, Novartis, Pfizer, QUIP GmbH, and other research support from AstraZeneca, Biocartis, Cepheid, Gilead, Illumina, Janssen, Novartis, Owkin, Qiagen, QUIP GmbH). M.E. declares general disclosures (personal fees, travel costs, and speaker’s honoraria from Zytomed Systems, Merck, Eisai, MSD, AstraZeneca, Janssen-Cilag, Cepheid, Roche, Astellas, Diaceutics, Owkin, BMS, BicycleTX, QUIP GmbH; research funding from AstraZeneca, Janssen-Cilag, STRATIFYER, Cepheid, Roche, Gilead, Owkin, QUIP GmbH, BicycleTX; advisory roles for Ferring, Diaceutics, MSD, AstraZeneca, Janssen-Cilag, GenomicHealth, Owkin, BMS, BicycleTX, Merck; member of the clinical advisory board of BicycleTX; stock ownership: BicycleTX). All other authors declare no competing interests.

Figures

Fig. 1. Overview of the multimodal head and neck cancer dataset.
A Data sources. For cancer diagnosis, demographics were assessed, and blood tests were performed. During the ablative surgery, tissue samples were obtained, and the pathological report was written. The dataset also features information about the treatment choice, events, and survival. B Image data of a patient. Shown are Whole Slide Images (WSIs) of the primary tumor and lymph node with hematoxylin and eosin (HE) staining and Tissue Microarray (TMA) cores from the tumor center and invasion front with HE and immunohistochemistry (IHC) staining. Scale bar as indicated (1 cm for WSIs and 1 mm for TMAs). C Demographic data, shown as the number of patients per sex, smoking status, and age at initial diagnosis. D Laboratory data. Shown is the number of patients for whom each parameter is available. The colors indicate values inside or outside of the normal range. E Primary tumor site or CUP (cancer of unknown primary) and grading from the pathology report. HPV-associated carcinoma was not graded. F Number of words in each German surgery report grouped by pathological T stage (N = 742 in total). Boxplots show the Q1–Q3 interval with the median; whiskers extend to 1.5 × the inter-quartile (Q1–Q3) range. G Kaplan-Meier plot of overall survival with the 95% confidence interval shown as a shaded band. The icons for demographics, surgery reports, therapy, and event data are CC BY licensed from Font Awesome. Source data are provided as a Source Data file.
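The Kaplan-Meier curve in panel G is a product-limit estimate of the survival function. The following is a minimal sketch of that estimator in plain Python, not the authors' code: the function name `kaplan_meier` and the toy follow-up times are assumptions made for illustration, and the 95% confidence band shown in the figure is not computed here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up duration per patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving past t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical, not taken from HANCOCK)
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients (event 0) lower the number at risk without producing a step in the curve, which is why the estimate only changes at observed event times.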
Fig. 2. Multimodal embeddings.
A For each patient, information from distinct modalities was encoded and concatenated into a multimodal patient vector. B We applied Uniform Manifold Approximation and Projection (UMAP) to visualize the vectors in 2D, and we implemented a genetic algorithm to create two test datasets, one in the distribution of the training data and one out of the distribution. C Visualization of two-dimensional embeddings, colored by features of the encoded data. D UMAP plots of three different train-test splits. E Receiver-operating characteristic (ROC) curves of a Random Forest classifier for the three splits and two prediction tasks. The mean values and standard deviations of the ROC curves and Area Under the Curve (AUC) scores are shown. The colors correspond to the different splits in (D). The icons for demographics and ICD codes are CC BY licensed from Font Awesome. Source data are provided as a Source Data file.
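Two steps named in this caption can be sketched in a few lines: concatenating per-modality encodings into one patient vector (panel A) and scoring a classifier by ROC AUC (panel E). This is an illustrative sketch only. The helper names `concat_modalities` and `roc_auc`, the feature values, and the scores are all hypothetical, and the Random Forest, UMAP, and genetic-algorithm steps from the caption are not reproduced here.

```python
def concat_modalities(*vectors):
    """Concatenate per-modality feature vectors into one patient vector."""
    patient = []
    for v in vectors:
        patient.extend(v)
    return patient


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical encodings for one patient (values are made up)
clinical = [1.0, 0.0, 63.0]  # e.g. sex, smoking status, age
blood = [13.5, 7.2]          # e.g. two laboratory parameters
patient_vector = concat_modalities(clinical, blood)

# Hypothetical classifier outputs for four patients
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
auc = roc_auc(labels, scores)
print(patient_vector, auc)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and avoids choosing thresholds explicitly.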
Fig. 3. Multimodal multiple instance learning allows the prediction of targets using imaging data.
A Multiple instance learning (MIL) pipeline. Tissue in WSIs is segmented and subsequently sampled in patches. These patches are encoded, for example, using the UNI architecture, and used as input for MIL together with a specified target. Using the CLAM framework, we can retrieve attention scores (blue: low attention, red: high attention). Scale bar is 1 cm for the WSI and its attention-labeled counterpart. B Slide-level AUC values for localization prediction on the test dataset (N = 10 each), color-coded for supervised (blue) and self-supervised (red) encoding backbones. All backbones except UNI are based on convolutional neural networks; UNI is based on a vision transformer. Boxplots show the Q1–Q3 interval with the median; whiskers extend to 1.5 × the inter-quartile (Q1–Q3) range. C Most attended patches for three test WSIs for all localizations tested (oropharynx, larynx, and oral cavity). Note the presence of gland tissue in oral cavity-derived samples. Scale bar indicates 30 μm. D Multimodal integration of different imaging data sources. We use separate encodings for WSIs (pink) and TMAs (green) using the UNI encoder for MIL. E Slide-level AUC values for survival prediction on the test dataset (N = 10 each). Boxplots show the Q1–Q3 interval with the median; whiskers extend to 1.5 × the inter-quartile (Q1–Q3) range. Scale bar for WSI indicates 1 cm, for TMAs 1 mm. F Attention scores and their frequency across information-containing groups and modalities for the test dataset. Source data are provided as a Source Data file.
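The idea behind attention-based MIL in panel A can be illustrated with a simplified pooling step: each patch embedding receives a score, the scores are normalized with a softmax, and the slide representation is the attention-weighted sum of the patch embeddings. The sketch below assumes a single linear scoring vector `w` and toy two-dimensional embeddings, both hypothetical. It is not the CLAM implementation, which uses a learned gated attention network.

```python
import math


def attention_pool(patch_embeddings, w):
    """Pool patch embeddings into one slide vector with softmax attention.

    patch_embeddings : list of equal-length feature vectors, one per patch
    w                : scoring vector (hypothetical stand-in for learned weights)
    Returns (slide_vector, attention_weights).
    """
    # Raw attention score per patch: dot product with the scoring vector
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in patch_embeddings]
    # Numerically stable softmax over patches
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Attention-weighted sum of patch embeddings
    dim = len(patch_embeddings[0])
    slide = [sum(a * x[j] for a, x in zip(attn, patch_embeddings))
             for j in range(dim)]
    return slide, attn


# Toy patch embeddings (hypothetical; real encoders produce far larger vectors)
patches = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
w = [0.5, 0.5]
slide_vector, attention = attention_pool(patches, w)
print(attention)
```

The attention weights sum to one across patches, so they can be mapped back onto the slide as a heatmap of the kind described in the caption.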
