Assessing genotype-phenotype correlations in colorectal cancer with deep learning: a multicentre cohort study
- PMID: 40829965
- DOI: 10.1016/j.landig.2025.100891
Abstract
Background: Deep learning-based models enable the prediction of molecular biomarkers from histopathology slides of colorectal cancer stained with haematoxylin and eosin; however, few studies have assessed prediction targets beyond microsatellite instability (MSI), BRAF, and KRAS systematically. We aimed to develop and validate a multi-target model based on deep learning for the simultaneous prediction of numerous genetic alterations and their associated phenotypes in colorectal cancer.
Methods: In this multicentre cohort study, tissue samples from patients with colorectal cancer were obtained by surgical resection and stained with haematoxylin and eosin. These samples were then digitised into whole-slide images and used to train and test a transformer-based deep learning algorithm for biomarker detection to simultaneously predict multiple genetic alterations and provide heatmap explanations. The primary dataset comprised 1376 patients from five cohorts who underwent comprehensive panel sequencing, with an additional 536 patients from two public datasets for validation. We compared the model's performance against conventional single-target models and examined the co-occurrence of alterations and shared morphology.
Findings: The multi-target model was able to predict numerous biomarkers from pathology slides, matching and partly exceeding single-target transformers. In the primary external validation cohorts, mean area under the receiver operating characteristic curve (AUROC) for the multi-target transformer was 0·78 (SD 0·01) for BRAF, 0·88 (0·01) for hypermutation, 0·93 (0·01) for MSI, and 0·86 (0·01) for RNF43; predictive performance was consistent across metrics and supported by co-occurrence analyses. However, biomarkers with high AUROCs largely correlated with MSI, with model predictions depending considerably on morphology associated with MSI at pathological examination.
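The AUROC figures above are reported as a mean with a standard deviation. The following minimal sketch shows one way such a summary could be computed. It assumes the SD is taken across several independently trained models (for example, cross-validation folds) evaluated on the same external cohort; the abstract does not state this. The function names `auroc` and `summarise` and all numbers in the usage note are illustrative and not taken from the study.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties contribute 0.5 (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarise(per_model_aurocs):
    """Mean and sample SD over per-model AUROCs (assumed: one value per
    trained model or fold; requires at least two values)."""
    return mean(per_model_aurocs), stdev(per_model_aurocs)
```

For example, `auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` ranks three of the four positive-negative pairs correctly, and `summarise` applied to the hypothetical per-fold values `[0.77, 0.78, 0.79]` gives a mean of about 0.78 with an SD of about 0.01.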
Interpretation: By use of morphology associated with MSI and more subtle biomarker-specific patterns within a shared phenotype, the multi-target transformers efficiently predicted biomarker status for diverse genetic alterations in colorectal cancer from slides stained with haematoxylin and eosin. These results highlight the importance of considering mutational co-occurrence and common morphology in biomarker research based on deep learning. Our validated and scalable model could support extension to other cancers and large, diverse cohorts, potentially facilitating cost-effective pre-screening and streamlined diagnostics in precision oncology.
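To illustrate what checking mutational co-occurrence can involve, the sketch below computes the phi coefficient between two binary biomarker labels (for instance, MSI status and a gene alteration) from their 2x2 contingency table. This is a generic association measure chosen for illustration; the abstract does not specify which co-occurrence statistic the authors used, and the function name `phi_coefficient` is hypothetical.

```python
from math import sqrt


def phi_coefficient(a, b):
    """Phi coefficient between two equal-length binary (0/1) label lists.
    Returns a value in [-1, 1]; values near 1 mean the two alterations
    tend to occur in the same patients."""
    if len(a) != len(b):
        raise ValueError("label lists must have the same length")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    # Product of the four marginal totals.
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        raise ValueError("phi is undefined when a label is constant")
    return (n11 * n00 - n10 * n01) / sqrt(denom)
```

A biomarker whose labels are strongly associated with MSI under such a measure may achieve a high AUROC mainly through MSI-associated morphology, which is the caveat the interpretation raises.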
Funding: German Federal Ministry of Health, Max-Eder-Programme of German Cancer Aid, German Federal Ministry of Education and Research, German Academic Exchange Service, and the EU.
Copyright © 2025 The Author(s). Published by Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of interests: JNK declares consulting services for Bioptimus, Owkin, DoMore Diagnostics, Panakeia, AstraZeneca, Mindpeak, and MultiplexDx. He also holds shares in StratifAI GmbH and Synagen GmbH; has received a research grant from GSK; and has received honoraria from AstraZeneca, Bayer, Daiichi Sankyo, Eisai, Janssen, Merck, MSD, Bristol Myers Squibb, Roche, Pfizer, and Fresenius. MG has received honoraria for lectures sponsored by Techniker Krankenkasse and AstraZeneca. NGR has received travel compensation from Bruker Spatial Biology (formerly NanoString) and holds stocks in Bayer, Pfizer, AstraZeneca, Roche, and Abivax. SF has received honoraria for lectures from Bristol Myers Squibb and MSD. UP reports consulting services for AbbVie. UP’s husband holds individual stocks in Arm Holdings, BYD Company, Crowdstrike Holdings, Stellantis, BioNTech SE ADR, Amazon, CureVac, NanoString Technologies, Google–Alphabet Class C, Nvidia, and Microsoft. All other authors declare no competing interests.
Update of
Assessing Genotype-Phenotype Correlations with Deep Learning in Colorectal Cancer: A Multi-Centric Study. medRxiv [Preprint]. 2025 Feb 8:2025.02.04.25321660. doi: 10.1101/2025.02.04.25321660. Update in: Lancet Digit Health. 2025 Aug;7(8):100891. doi: 10.1016/j.landig.2025.100891. PMID: 39973981. Free PMC article.