Gut. 2021 Mar;70(3):544-554.

doi: 10.1136/gutjnl-2019-319866. Epub 2020 Jul 20.

Image-based consensus molecular subtype (imCMS) classification of colorectal cancer using deep learning

Korsuk Sirinukunwattana^{1,2,3}, Enric Domingo⁴, Susan D Richman⁵, Keara L Redmond⁶, Andrew Blake⁷, Clare Verrill^{3,8,9}, Simon J Leedham^{10,11}, Aikaterini Chatzipli¹², Claire Hardy¹², Celina M Whalley¹³, Chieh-Hsi Wu¹⁴, Andrew D Beggs¹⁵, Ultan McDermott¹², Philip D Dunne¹⁶, Angela Meade¹⁷, Steven M Walker¹⁸, Graeme I Murray¹⁹, Leslie Samuel²⁰, Matthew Seymour²¹, Ian Tomlinson^{13,22}, Phil Quirke⁵, Timothy Maughan²³, Jens Rittscher^#^{24,2,3,25}, Viktor H Koelzer^#^{4,26,27}; S:CORT consortium

Affiliations

¹ Institute of Biomedical Engineering (IBME), Department of Engineering Science, University of Oxford, Oxford, UK.
² Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Oxford, UK.
³ Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, UK.
⁴ Department of Oncology, University of Oxford, Oxford, UK. Email: viktor.koelzer@usz.ch, jens.rittscher@eng.ox.ac.uk, enric.domingo@oncology.ox.ac.uk
⁵ Department of Pathology and Tumour Biology, Leeds Institute of Cancer and Pathology, Leeds, UK.
⁶ Centre for Cancer Research and Cell Biology, Faculty of Medicine, Health and Life Sciences, Queen's University Belfast, Belfast, UK.
⁷ Department of Oncology, University of Oxford, Oxford, UK.
⁸ Department of Cellular Pathology, Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
⁹ Nuffield Department of Surgical Sciences and NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK.
¹⁰ Gastrointestinal Stem-cell Biology Laboratory, Oxford Centre for Cancer Gene Research, Wellcome Trust Centre for Human Genetics, Oxford, UK.
¹¹ Translational Gastroenterology Unit, Experimental Medicine Division, Nuffield Department of Clinical Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK.
¹² Wellcome Trust Sanger Institute, Hinxton, UK.
¹³ Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK.
¹⁴ Department of Statistics, University of Oxford, Oxford, UK.
¹⁵ School of Cancer Sciences, University of Birmingham, Birmingham, UK.
¹⁶ Centre for Cancer Research and Cell Biology, Queen's University Belfast, Belfast, UK.
¹⁷ MRC Clinical Trials Unit at University College London, London, UK.
¹⁸ Almac Diagnostics Ltd, Craigavon, UK.
¹⁹ Department of Pathology, School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, UK.
²⁰ Department of Clinical Oncology, Aberdeen Royal Infirmary, Aberdeen, UK.
²¹ Department of Oncology, Leeds Institute of Cancer and Pathology, Leeds, UK.
²² Edinburgh Cancer Centre, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK.
²³ CRUK/MRC Oxford Institute for Radiation Oncology, University of Oxford, Oxford, UK.
²⁴ Institute of Biomedical Engineering (IBME), Department of Engineering Science, University of Oxford, Oxford, UK. Email: viktor.koelzer@usz.ch, jens.rittscher@eng.ox.ac.uk, enric.domingo@oncology.ox.ac.uk
²⁵ Ludwig Institute for Cancer Research, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK.
²⁶ Nuffield Department of Medicine, University of Oxford, Oxford, UK.
²⁷ Department of Pathology and Molecular Pathology, University of Zurich, Zurich, Switzerland.

^# Contributed equally.

PMID: 32690604
PMCID: PMC7873419
DOI: 10.1136/gutjnl-2019-319866

Korsuk Sirinukunwattana et al. Gut. 2021 Mar.

Abstract

Objective: Complex phenotypes captured on histological slides represent the biological processes at play in individual cancers, but the link to underlying molecular classification has not been clarified or systematised. In colorectal cancer (CRC), histological grading is a poor predictor of disease progression, and consensus molecular subtypes (CMSs) cannot be distinguished without gene expression profiling. We hypothesise that image analysis is a cost-effective tool to associate complex features of tissue organisation with molecular and outcome data and to resolve unclassifiable or heterogeneous cases. In this study, we present an image-based approach to predict CRC CMS from standard H&E sections using deep learning.

Design: Training and evaluation of a neural network were performed using a total of n=1206 tissue sections with comprehensive multi-omic data from three independent datasets (training on FOCUS trial, n=278 patients; test on rectal cancer biopsies, GRAMPIAN cohort, n=144 patients; and The Cancer Genome Atlas (TCGA), n=430 patients). Ground truth CMS calls were ascertained by matching random forest and single sample predictions from CMS classifier.

Results: Image-based CMS (imCMS) accurately classified slides in unseen datasets from TCGA (n=431 slides, AUC=0.84) and rectal cancer biopsies (n=265 slides, AUC=0.85). imCMS spatially resolved intratumoural heterogeneity and provided secondary calls correlating with bioinformatic prediction from molecular data. imCMS classified samples previously unclassifiable by RNA expression profiling, reproduced the expected correlations with genomic and epigenetic alterations and showed prognostic associations similar to those of transcriptomic CMS.

Conclusion: This study shows that a prediction of RNA expression classifiers can be made from H&E images, opening the door to simple, cheap and reliable biological stratification within routine workflows.

Keywords: colorectal pathology; computerised image analysis; molecular pathology.

Conflict of interest statement

Competing interests: KS and JR are co-founders of the University of Oxford spinout Ground Truth Labs.

Figures

**Figure 1**
Data, study design and imCMS classification framework. Three independent datasets (FOCUS, TCGA and GRAMPIAN) were used in this study. (A) The distribution of the samples stratified by the CMS calls in each dataset. (B) The FOCUS dataset was primarily used for learning the imCMS discriminative model, while the TCGA and GRAMPIAN datasets were used for testing. (C) Training of the imCMS discriminative model based on the domain adversarial approach. Image tiles were extracted from annotated tumour regions. Tiles from the FOCUS cohort were categorised by CMS class of the original slide and were used to train the model to predict the imCMS classes on unseen datasets. Tiles from the TCGA and GRAMPIAN cohorts were unlabelled and were used together with those from the FOCUS cohort in the cohort (domain) prediction. Domain adversarial training forced the cohort classifier to perform poorly, which in turn encouraged the model to learn indiscriminative features across datasets. Five distinct models were produced. (D) At the inference time, the ensemble of the learnt models predicts the imCMS class for each of the image tiles extracted from annotated tumour regions of a slide. A slide is assigned to the imCMS class with the maximum prediction score (ie, highest number of tiles in the slide). imCMS, image-based consensus molecular subtype; TCGA, The Cancer Genome Atlas.
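The slide-level rule in panel (D) can be illustrated with a minimal sketch. It assumes that the ensemble is combined by averaging per-class probabilities for each tile, and that ties are broken by class order; the caption does not state either detail. The function names `tile_call` and `slide_call` are hypothetical and are not taken from the study's code.

```python
from collections import Counter

CLASSES = ["imCMS1", "imCMS2", "imCMS3", "imCMS4"]

def tile_call(model_probs):
    """Return the class index for one tile.

    model_probs holds one probability vector per ensemble member.
    Averaging the vectors is an assumption made for this sketch.
    """
    n_models = len(model_probs)
    mean = [sum(p[k] for p in model_probs) / n_models for k in range(len(CLASSES))]
    # max() keeps the first index on ties (an arbitrary choice here).
    return max(range(len(CLASSES)), key=lambda k: mean[k])

def slide_call(tiles):
    """Assign a slide to the class predicted for the largest share of its tiles."""
    counts = Counter(tile_call(t) for t in tiles)
    n_tiles = len(tiles)
    # Slide-level prediction score: fraction of tiles called as each class.
    scores = {CLASSES[k]: counts.get(k, 0) / n_tiles for k in range(len(CLASSES))}
    label = max(scores, key=scores.get)
    return label, scores

# Toy slide with three tiles, each scored by a two-model ensemble.
toy_slide = [
    [[0.1, 0.1, 0.1, 0.7], [0.2, 0.1, 0.1, 0.6]],  # tile called imCMS4
    [[0.1, 0.6, 0.2, 0.1], [0.1, 0.5, 0.2, 0.2]],  # tile called imCMS2
    [[0.0, 0.2, 0.2, 0.6], [0.1, 0.2, 0.1, 0.6]],  # tile called imCMS4
]
label, scores = slide_call(toy_slide)
print(label, scores)
```

In this toy slide, two of the three tiles are called imCMS4, so the slide is assigned to imCMS4 with a prediction score of 2/3.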

**Figure 2**
Image-based consensus molecular subtype (imCMS) classification. (A) Receiver operating curves of the imCMS classifier, optimised by the domain adversarial approach, on the FOCUS (n slides=510, 3×), TCGA (n slides=431, 3×) and GRAMPIAN cohorts (n slides=265, 12×). (B) Correspondences between CMS and imCMS classes in different datasets. All samples labelled as unclassified by RNA-based CMS calls were reclassified by imCMS. (C) Examples of image tiles with high prediction confidence for each imCMS class in FOCUS. Histological patterns associated with imCMS1 are mucin and lymphocytic infiltration. In imCMS2, evident cribriform growth patterns and comedo-like necrosis are observed, while imCMS3 is characterised by ectatic, mucin-filled glandular structures in combination with a minor component showing papillary and cribriform morphology. imCMS4 is predominantly associated with infiltrative CRC growth pattern, a prominent desmoplastic stromal reaction and frequent presence of single cell invasion (tumour budding). Scale bar ~1 mm. (D) Molecular associations of the CMS classified samples (black) and the CMS unclassified samples that have been classified by imCMS (grey). The molecular profiles of reclassified samples are largely consistent with those of the classified CMS samples. Statistically significant differences (p<0.05) are marked with a red asterisk. AUC, area under the curve; TCGA, The Cancer Genome Atlas.

**Figure 3**
Intratumoural heterogeneity of the imCMS molecular subtypes. (A) Visualisation of the regional classification of the imCMS classifier. imCMS classification of a tumour sample can exhibit uniform results (left) or a degree of variation in the predicted imCMS class and the level of confidence (right). The colour overlay indicates the imCMS classes and the opacity reflects the classification confidence. (B) Heterogeneity of the CMS and imCMS classification scores. Each bar represents classification scores of a sample, and samples are sorted by the entropy of the prediction scores from the molecular-based random forest CMS classifier. (C) Heterogeneity of the CMS classification. A secondary CMS call was derived by relaxing the classification threshold of the random forest CMS classifier. (D) Cosine similarity between the imCMS and CMS prediction scores, stratified by the primary and secondary CMS calls. The levels of similarity were compared against those produced by a random classifier. Statistical analysis was performed using Wilcoxon rank-sum test, adjusted for the false discovery rate. P value <0.05 was considered statistically significant. n indicates the number of patients. Note that two diagnostic slides (serial sections) were available for the majority of cases in the FOCUS and GRAMPIAN cohorts. In cases where two slides were available, the analyses for each slide were performed separately. Panels (B) and (D) report the results for the first slide. The matched results for the second slide are provided in online supplementary figure S10. imCMS predictions represent the calls made by the domain adversarially trained imCMS classifier. imCMS, image-based consensus molecular subtype; TCGA, The Cancer Genome Atlas.
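The two quantities used in panels (B) and (D), the entropy of a sample's prediction scores and the cosine similarity between imCMS and CMS score vectors, can be sketched as follows. This is an illustration only: the base-2 logarithm, the normalisation of scores to sum to one, and the toy score vectors are assumptions, not values or settings from the study.

```python
import math

def entropy(scores):
    """Shannon entropy (in bits) of a non-negative score vector.

    Scores are normalised to sum to 1; zero entries contribute nothing.
    """
    total = sum(scores)
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log2(p) for p in probs)

def cosine_similarity(a, b):
    """Cosine of the angle between two score vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical per-sample scores over CMS1 to CMS4 (made-up numbers).
cms_scores = {
    "sample_A": [0.05, 0.85, 0.05, 0.05],  # confident, low entropy
    "sample_B": [0.30, 0.30, 0.10, 0.30],  # mixed, high entropy
}
imcms_scores = {
    "sample_A": [0.10, 0.80, 0.05, 0.05],
    "sample_B": [0.40, 0.20, 0.10, 0.30],
}

# Sort samples by the entropy of the molecular scores, as in panel (B).
ordered = sorted(cms_scores, key=lambda s: entropy(cms_scores[s]))
for sample in ordered:
    sim = cosine_similarity(imcms_scores[sample], cms_scores[sample])
    print(sample, round(entropy(cms_scores[sample]), 3), round(sim, 3))
```

A uniform score vector over four classes gives the maximum entropy of 2 bits, while a vector concentrated on a single class gives 0, so the sort places the most confidently classified samples first.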

**Figure 4**
Prognostic associations of the image-based consensus molecular subtypes (imCMSs). Overall survival (OS) outcomes of the FOCUS cohort (n=278 patients, (A)) and TCGA cohort (n=395 patients, (B)), progression-free interval (PFI) outcome of the TCGA cohort (n=395, (C)) and relapse-free survival (RFS) outcome (n=83, (D)) as stratified by the transcriptional-based CMS classification and imCMS classification produced by the domain adversarially trained imCMS classifier. Kaplan–Meier estimator was used to estimate the survival probability, and pairwise log-rank test and univariate Cox proportional hazards regression were performed between CMS groups and imCMS groups. HRs and 95% CI for pairwise comparisons were reported. Test results with p value<0.05 were considered statistically significant. TCGA, The Cancer Genome Atlas.
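For readers unfamiliar with the Kaplan–Meier estimator named in this caption, a minimal pure-Python sketch of the product-limit calculation is given below. The follow-up times and event flags are invented for illustration and do not come from the FOCUS, TCGA or GRAMPIAN cohorts; the function name `kaplan_meier` is hypothetical, and a real analysis would normally rely on an established survival analysis package.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Events observed exactly at time t.
        observed = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve

# Toy data: five patients, two of them censored (event flag 0).
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients count towards the number at risk up to their last follow-up time but never trigger a drop in the curve, which is why the estimate falls only at times 1, 2 and 3 in this toy example.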
