Lancet Digit Health. 2021 Dec;3(12):e763-e772.

doi: 10.1016/S2589-7500(21)00180-1. Epub 2021 Oct 19.

Development and validation of a weakly supervised deep learning framework to predict the status of molecular pathways and key mutations in colorectal cancer from routine histology images: a retrospective study

Mohsin Bilal¹, Shan E Ahmed Raza¹, Ayesha Azam², Simon Graham¹, Mohammad Ilyas³, Ian A Cree⁴, David Snead⁵, Fayyaz Minhas¹, Nasir M Rajpoot⁶

Affiliations

¹ Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK.
² Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK; Department of Pathology, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK.
³ Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham, UK.
⁴ International Agency for Research on Cancer, Lyon, France.
⁵ Department of Pathology, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK.
⁶ Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK; Department of Pathology, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK. Electronic address: n.m.rajpoot@warwick.ac.uk.

PMID: 34686474
PMCID: PMC8609154
DOI: 10.1016/S2589-7500(21)00180-1

Mohsin Bilal et al. Lancet Digit Health. 2021 Dec.

Abstract

Background: Determining the status of molecular pathways and key mutations in colorectal cancer is crucial for optimal therapeutic decision making. We therefore aimed to develop a novel deep learning pipeline to predict the status of key molecular pathways and mutations from whole-slide images of haematoxylin and eosin-stained colorectal cancer slides as an alternative to current tests.

Methods: In this retrospective study, we used 502 diagnostic slides of primary colorectal tumours from 499 patients in The Cancer Genome Atlas colon and rectal cancer (TCGA-CRC-DX) cohort and developed a weakly supervised deep learning framework involving three separate convolutional neural network models. Whole-slide images were divided into equally sized tiles and model 1 (ResNet18) separated tumour tiles from non-tumour tiles. These tumour tiles were fed to model 2 (adapted ResNet34), which was trained by iterative draw and rank sampling to calculate a prediction score for each tile that represented the likelihood of the tile belonging to the molecular labels of high mutation density (vs low mutation density), microsatellite instability (vs microsatellite stability), chromosomal instability (vs genomic stability), CpG island methylator phenotype (CIMP)-high (vs CIMP-low), BRAF^mut (vs BRAF^WT), TP53^mut (vs TP53^WT), and KRAS^WT (vs KRAS^mut). These scores were used to identify the top-ranked tiles from each slide, and model 3 (HoVer-Net) segmented and classified the different types of cell nuclei in these tiles. We calculated the area under the convex hull of the receiver operating characteristic curve (AUROC) as a model performance measure and compared our results with those of previously published methods.
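
The abstract does not spell out how the area under the convex hull of the ROC curve is computed. One plausible reading, sketched here in plain Python, is to build the empirical ROC points, take their upper convex hull, and integrate with the trapezoidal rule. The function names (`roc_points`, `upper_hull`, `area_under`) and the toy labels and scores are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Empirical ROC points from sweeping a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last item in a run of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of the ROC points, scanned from left to right."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def area_under(points: Sequence[Point]) -> float:
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Toy data (made up for illustration): two positives, two negatives.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6]
raw = roc_points(toy_labels, toy_scores)
raw_auc = area_under(raw)               # area under the empirical ROC curve
hull_auc = area_under(upper_hull(raw))  # area under its convex hull
```

By construction the hull area is never smaller than the area under the empirical curve, which is worth keeping in mind when comparing these values with AUROCs reported by other methods.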

Findings: Our iterative draw and rank sampling method yielded mean AUROCs for the prediction of hypermutation (0·81 [SD 0·03] vs 0·71), microsatellite instability (0·86 [0·04] vs 0·74), chromosomal instability (0·83 [0·02] vs 0·73), BRAF^mut (0·79 [0·01] vs 0·66), and TP53^mut (0·73 [0·02] vs 0·64) in the TCGA-CRC-DX cohort that were higher than those from previously published methods, and an AUROC for KRAS^mut that was similar to previously reported methods (0·60 [SD 0·04] vs 0·60). Mean AUROC for predicting CIMP-high status was 0·79 (SD 0·05). We found high proportions of tumour-infiltrating lymphocytes and necrotic tumour cells to be associated with microsatellite instability, and high proportions of tumour-infiltrating lymphocytes and a low proportion of necrotic tumour cells to be associated with hypermutation.

Interpretation: After large-scale validation, our proposed algorithm for predicting clinically important mutations and molecular pathways, such as microsatellite instability, in colorectal cancer could be used to stratify patients for targeted therapies with potentially lower costs and quicker turnaround times than sequencing-based or immunohistochemistry-based approaches.

Funding: The UK Medical Research Council.

Conflict of interest statement

Declaration of interests: MI reports grants from Roche, outside the submitted work. DS reports personal fees from Royal Philips, outside the submitted work. NMR reports research funding from GlaxoSmithKline and is also part of the PathLAKE consortium, which is partly funded by Royal Philips. All other authors declare no competing interests.

Figures

**Figure 1**
IDaRS prediction pipeline and histopathological feature discovery of colorectal cancer pathways. (A) Tissue segmentation and tile extraction were performed to obtain informative tiles. Model 1 (ResNet18) was trained to separate tumour from non-tumour tiles. The tumour tiles served as input to iterative draw and rank sampling (an adaptation of ResNet34; model 2), which was trained on tumour tiles for label prediction. (B) A concept diagram of iterative draw and rank sampling illustrating the training strategy for the fast labelling of whole-slide images. The deep learning model was trained iteratively for classification with a random draw (d_i) of the same number of tiles from each whole-slide image and the k top-ranked tiles of the same slide drawn in the previous iteration. (C) The trained iterative draw and rank sampling model gave a prediction score to each tile in the whole-slide image; these scores were used to obtain a slide score and to identify the top-ranked tiles from each slide. (D) Model 3 (HoVer-Net) inference was used to segment and classify different types of nuclei in top-ranked representative tiles in a cellular composition analysis of colorectal cancer pathways. Histological patterns of the molecular characteristics of colorectal cancers are shown as a spider plot based on the feature importance of different cellular composition profiles modelled via a support vector machine. H&E=haematoxylin and eosin. IDaRS=iterative draw and rank sampling. NEP1=neoplastic epithelial type 1. NEP2=neoplastic epithelial type 2.
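
The sampling step described in panel B can be sketched as follows. This is a minimal illustration of the draw-and-rank idea only, under stated assumptions: the names `draw_and_rank`, `r`, `k`, and `score_tile` are hypothetical, tiles are represented by integer identifiers, the CNN update is left as a comment, and the scoring function is a stand-in rather than the authors' model.

```python
import random
from typing import Callable, Dict, List, Tuple

Tile = int  # hypothetical tile identifier


def draw_and_rank(
    tiles_by_slide: Dict[str, List[Tile]],
    prev_top: Dict[str, List[Tile]],
    score_tile: Callable[[Tile], float],
    r: int,
    k: int,
    rng: random.Random,
) -> Tuple[Dict[str, List[Tile]], Dict[str, List[Tile]]]:
    """One iteration: r random tiles per slide plus the k top-ranked
    tiles carried over from the previous iteration."""
    training_sets: Dict[str, List[Tile]] = {}
    next_top: Dict[str, List[Tile]] = {}
    for slide, tiles in tiles_by_slide.items():
        kept = prev_top.get(slide, [])
        kept_set = set(kept)
        # Draw only from tiles that were not carried over.
        pool = [t for t in tiles if t not in kept_set]
        drawn = rng.sample(pool, min(r, len(pool)))
        selected = kept + drawn
        # A real pipeline would update the CNN on `selected` here, using the
        # slide-level label as the weak label of every selected tile.
        ranked = sorted(selected, key=score_tile, reverse=True)
        training_sets[slide] = selected
        next_top[slide] = ranked[:k]
    return training_sets, next_top


# Usage with made-up slides and a stand-in score in place of a CNN output.
rng = random.Random(0)
slides = {"slide_a": list(range(100)), "slide_b": list(range(100, 160))}
stand_in_score = lambda t: (t * 37 % 101) / 101
top: Dict[str, List[Tile]] = {}
for _ in range(3):
    batch, top = draw_and_rank(slides, top, stand_in_score, r=8, k=4, rng=rng)
```

Because each iteration touches only `r + k` tiles per slide rather than every tile, the cost per iteration stays fixed regardless of slide size, which is consistent with the "fast labelling" framing in the caption.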

**Figure 2**
Iterative draw and rank sampling-based prediction of colorectal cancer pathways in the TCGA-CRC-DX cohort. AUROC plots of four-fold cross-validation for prediction of hypermutation (A), microsatellite instability (B), chromosomal instability (C), CpG island methylator phenotype (D), *BRAF* mutation status (E), and *TP53* mutation status (F). The true positive rate represents sensitivity and the false positive rate represents 1–specificity. The blue shaded areas represent the SD. AUROC=area under the convex hull of the receiver operating characteristic curve. TCGA-CRC-DX=The Cancer Genome Atlas colon and rectal cancer.
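
For readers less familiar with the axes, the relationship between the plotted rates and sensitivity and specificity at a single decision threshold can be written out as below. The helper name `rates_at_threshold` and the example labels and scores are assumptions made for illustration; they are not data from the study.

```python
from typing import Dict, Sequence


def rates_at_threshold(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, TPR and FPR when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tpr": sensitivity,        # y-axis of the ROC plot
        "fpr": 1.0 - specificity,  # x-axis of the ROC plot
    }


# Made-up slide-level labels and scores, thresholded at 0.5.
example = rates_at_threshold([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1], threshold=0.5)
```

Sweeping the threshold from high to low traces out the full ROC curve, one (FPR, TPR) point per threshold.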

**Figure 3**
Spider chart of differential cellular compositions as histological features of colorectal cancer pathways. Normalised weights between –1 and 1 show the size of significance of the corresponding histological feature. CIMP-high=CpG island methylator phenotype of high frequencies of DNA hypermethylation. NEP1=neoplastic epithelial type 1. NEP2=neoplastic epithelial type 2.

Comment in

Towards computationally efficient prediction of molecular signatures from routine histology images.
Lafarge MW, Koelzer VH. Lancet Digit Health. 2021 Dec;3(12):e752-e753. doi: 10.1016/S2589-7500(21)00232-6. Epub 2021 Oct 19. PMID: 34686475.
