Observational Study

Sci Rep. 2018 Aug 13;8(1):12054.

doi: 10.1038/s41598-018-30535-1.

Automated Gleason grading of prostate cancer tissue microarrays via deep learning

Eirini Arvaniti^# ¹ ², Kim S Fricker^# ³, Michael Moret¹, Niels Rupp³, Thomas Hermanns⁴, Christian Fankhauser⁴, Norbert Wey³, Peter J Wild³ ⁵, Jan H Rüschoff⁶, Manfred Claassen⁷ ⁸

Affiliations

¹ Institute for Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
² Swiss Institute of Bioinformatics (SIB), Zurich, Switzerland.
³ Department of Pathology and Molecular Pathology, University of Zurich, Zurich, Switzerland.
⁴ Department of Urology, University of Zurich, Zurich, Switzerland.
⁵ Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt, Germany.
⁶ Department of Pathology and Molecular Pathology, University of Zurich, Zurich, Switzerland. JanHendrik.Rueschoff@usz.ch.
⁷ Institute for Molecular Systems Biology, ETH Zurich, Zurich, Switzerland. claassen@imsb.biol.ethz.ch.
⁸ Swiss Institute of Bioinformatics (SIB), Zurich, Switzerland. claassen@imsb.biol.ethz.ch.

^# Contributed equally.

PMID: 30104757
PMCID: PMC6089889
DOI: 10.1038/s41598-018-30535-1


Eirini Arvaniti et al. Sci Rep. 2018.


Erratum in

Author Correction: Automated Gleason grading of prostate cancer tissue microarrays via deep learning.
Arvaniti E, Fricker KS, Moret M, Rupp N, Hermanns T, Fankhauser C, Wey N, Wild PJ, Rüschoff JH, Claassen M. Sci Rep. 2019 May 16;9(1):7668. doi: 10.1038/s41598-019-43989-8. PMID: 31092857.
Author Correction: Automated Gleason grading of prostate cancer tissue microarrays via deep learning.
Arvaniti E, Fricker KS, Moret M, Rupp N, Hermanns T, Fankhauser C, Wey N, Wild PJ, Rüschoff JH, Claassen M. Sci Rep. 2021 Nov 23;11(1):23032. doi: 10.1038/s41598-021-02195-1. PMID: 34815456.

Abstract

The Gleason grading system has remained the most powerful prognostic predictor for patients with prostate cancer since the 1960s. Its application requires highly trained pathologists, is tedious, and yet suffers from limited inter-pathologist reproducibility, especially for the intermediate Gleason score 7. Automated annotation procedures constitute a viable solution to remedy these limitations. In this study, we present a deep learning approach for automated Gleason grading of prostate cancer tissue microarrays with Hematoxylin and Eosin (H&E) staining. Our system was trained using detailed Gleason annotations on a discovery cohort of 641 patients and was then evaluated on an independent test cohort of 245 patients annotated by two pathologists. On the test cohort, the inter-annotator agreements between the model and each pathologist, quantified via Cohen's quadratic kappa statistic, were 0.75 and 0.71, respectively, comparable with the inter-pathologist agreement (kappa = 0.71). Furthermore, the model's Gleason score assignments achieved pathology expert-level stratification of patients into prognostically distinct groups, on the basis of disease-specific survival data available for the test cohort. Overall, our study shows promising results regarding the applicability of deep learning-based solutions towards more objective and reproducible prostate cancer grading, especially for cases with heterogeneous Gleason patterns.
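The agreement statistic named in the abstract, Cohen's kappa with quadratic weights, can be illustrated with a minimal sketch. The function name and the integer encoding of the ordinal grades (for example 0 for benign, then increasing Gleason scores) are assumptions made for this example; the study's own evaluation code is not shown here.

```python
from collections import Counter

def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint label counts
    hist_a = Counter(rater_a)                  # marginal counts, rater A
    hist_b = Counter(rater_b)                  # marginal counts, rater B
    max_dist = (num_classes - 1) ** 2
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / max_dist   # penalty grows with squared distance
            disagree_obs += weight * observed[(i, j)] / n
            disagree_exp += weight * hist_a[i] * hist_b[j] / (n * n)
    if disagree_exp == 0:
        return 1.0  # both raters used one identical label throughout
    return 1.0 - disagree_obs / disagree_exp

# Hypothetical labels: perfect agreement gives 1, chance-level agreement gives 0.
print(quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))
print(quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], 2))
```

Because the weights grow with the squared distance between grades, confusing two adjacent Gleason scores is penalized less than confusing distant ones.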


Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Overall annotation procedure. **(a)** Examples of TMA spot Gleason annotations provided by the pathologists (blue: Gleason pattern 3 region, yellow: Gleason pattern 4 region, red: Gleason pattern 5 region). **(b)** During the training phase (top row), a deep neural network was trained as a patch-level classifier. We used the MobileNet architecture, whose main building blocks are “depthwise separable” convolutions: a special type of convolution block with considerably fewer parameters than normal convolutions. Convolution blocks are used to extract increasingly complex features from the input image. Following the convolution blocks, a global average pooling layer computes the spatial average of each feature map at the last convolution layer, effectively summarizing the locally-detected patterns across the entire image. Finally, the output layer produces the final classification decision for each input image patch by computing a probability distribution over the four Gleason classes considered in this study. During the evaluation phase (bottom row), the trained patch-level convolutional neural network was applied to entire TMA spot images in a sliding window fashion and generated pixel-level probability maps for each class. A Gleason score was assigned to a TMA spot as the sum of the primary and secondary Gleason patterns detected (above a threshold) in the corresponding output pixel-level maps.
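The parameter saving of depthwise separable convolutions mentioned in the caption can be checked with simple arithmetic. The sketch below counts weights only (biases ignored), and the layer sizes are illustrative values chosen for this example, not figures from the study.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel filter per (input, output) channel pair."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Illustrative layer: 3x3 kernel, 256 input and 256 output channels.
print(standard_conv_params(3, 256, 256))        # 589824
print(depthwise_separable_params(3, 256, 256))  # 67840
```

For this example layer the separable variant needs roughly one ninth of the weights, which is the kind of reduction the caption refers to.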

**Figure 2**
Model evaluation on test cohort (image patch level) and inter-pathologist variability. All confusion matrices were normalized per row (ground truth label) reflecting the recall metric for each class. **(a)** Patch-based model annotations compared with annotations by 1st pathologist. **(b)** Patch-based model annotations compared with annotations by 2nd pathologist. **(c)** Annotations by 2nd pathologist compared with annotations by 1st pathologist. **(d)** Venn diagrams illustrating the overlap in patch-level Gleason annotations produced by the deep learning model and the two pathologists.
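Row normalization of a confusion matrix, as described in the caption, can be sketched as follows. The function name and the toy labels are assumptions for illustration; each row corresponds to a ground-truth class, so the diagonal entry of a row is that class's recall.

```python
def row_normalized_confusion(true_labels, pred_labels, num_classes):
    """Confusion matrix whose rows (ground-truth classes) each sum to 1."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        counts[t][p] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        # A class absent from the ground truth yields a row of zeros.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized

# Toy example with two classes: recall is 0.5 for class 0 and 0.75 for class 1.
matrix = row_normalized_confusion([0, 0, 1, 1, 1, 1], [0, 1, 1, 1, 1, 0], 2)
print(matrix)  # [[0.5, 0.5], [0.25, 0.75]]
```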

**Figure 3**
Representative examples of model predictions as pixel-level probability maps and visual comparison with pathologist annotations. Each subfigure (a–d) corresponds to a different TMA spot. Within each subfigure **(a–d)**, the subplots in the right-most column show the Gleason patterns assigned by the two pathologists (blue: Gleason 3 region, yellow: Gleason 4 region, red: Gleason 5 region). The other four subplots show the model’s Gleason annotations. **(a)** The annotation of the model agrees overall with the two pathologists, except for a small tissue region in the upper part which is marked as Gleason pattern 3 exclusively by the model. Retrospective assessment of this part by the pathologists confirmed the presence of a small focus of atypical glands. **(b)** The model and pathologist annotations agree on Gleason pattern 4. **(c)** Disagreement in annotations (Gleason pattern 3 versus 4) by the model and the two pathologists. A third uropathologist independently evaluated this case and his opinion coincided with the model’s annotations. **(d)** Disagreement in annotations (Gleason pattern 4 versus 5) by the model and the two pathologists. A third uropathologist independently evaluated this case and assigned a Gleason pattern 4, noting however the presence of diffuse single cells which could be interpreted as Gleason pattern 5.

**Figure 4**
Model evaluation on test cohort (TMA spot level) and inter-pathologist variability. Each TMA spot is annotated with detected Gleason patterns (Gleason 3, 4 or 5) by the model and two pathologists. Then, a final Gleason score is assigned as the sum of the two most predominant Gleason patterns. If no cancer is detected, the TMA spot is classified as benign. We show confusion matrices for the comparison of Gleason score assignments by **(a)** the model and the first pathologist, **(b)** the model and the second pathologist, **(c)** the two pathologists.
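The spot-level scoring rule in this caption (sum of the two most predominant detected patterns, benign if none is detected) can be sketched as below. The input format (fraction of the spot covered by each pattern), the threshold value, the use of `None` for benign, and the doubling of a single detected pattern (e.g. 3+3) are assumptions made for this example rather than details confirmed by the source.

```python
def assign_gleason_score(pattern_fractions, min_fraction=0.05):
    """Return primary + secondary Gleason pattern, or None if the spot is benign.

    pattern_fractions maps a Gleason pattern (3, 4 or 5) to the fraction of the
    spot assigned to it; min_fraction is a hypothetical detection threshold.
    """
    detected = sorted(
        (p for p, frac in pattern_fractions.items() if frac >= min_fraction),
        key=lambda p: pattern_fractions[p],
        reverse=True,  # most predominant pattern first
    )
    if not detected:
        return None  # no cancer pattern above the threshold: benign
    primary = detected[0]
    # Assumption: a single detected pattern counts as primary and secondary.
    secondary = detected[1] if len(detected) > 1 else primary
    return primary + secondary

print(assign_gleason_score({3: 0.6, 4: 0.3, 5: 0.0}))   # 7
print(assign_gleason_score({3: 0.01, 4: 0.0, 5: 0.0}))  # None
```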

**Figure 5**
Model interpretation via class activation mapping (CAM). For each class, we show two examples of image patches that were confidently and correctly classified by the deep learning model. In addition, the regions on which the model focuses to make its predictions are highlighted. In each example, the first column shows the image patch. In the second column, a heatmap generated by the class activation mapping technique is overlaid, highlighting the most important regions for the model predictions. In the third column, only the highlighted part of the image is shown. Class activation maps are generated by projecting the class-specific weights of the output classification layer back to the feature maps of the last convolutional layer, thus highlighting important regions for predicting a particular class. The final CAM heatmap is computed as the sum of the resulting augmented feature maps, followed by clipping negative values and subsequent scaling to the [0, 1] interval. Red color indicates regions where the CAM heatmap values are close to 1, i.e. the most class-specific discriminative parts of the image.
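The CAM computation described in this caption (weighted sum of the last-layer feature maps, clipping of negative values, scaling to [0, 1]) can be sketched in plain Python. The nested-list layout `feature_maps[k][y][x]` and scaling by the maximum value are assumptions for this example; an actual implementation would typically operate on arrays from the trained network.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class, clipped at 0 and scaled to [0, 1]."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    # Project the class-specific output weights back onto each feature map.
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    # Clip negative values, then scale by the maximum (assumed scaling scheme).
    cam = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam

# Two toy 2x2 feature maps with hypothetical class weights 0.5 and -1.0.
maps = [[[1.0, 2.0], [0.0, 4.0]], [[1.0, 0.0], [3.0, 0.0]]]
print(class_activation_map(maps, [0.5, -1.0]))  # [[0.0, 0.5], [0.0, 1.0]]
```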

**Figure 6**
Disease-specific survival analysis results. **(a)** Kaplan-Meier curves for patients who were split into three risk groups according to Gleason score annotations by the model and two pathologists. The shaded regions indicate 95% confidence bands. P-values for pairwise two-tailed logrank tests with Benjamini-Hochberg correction are reported. **(b)** Venn diagrams illustrating overlap in model-based and pathologist annotation-based assignment of patients into Gleason score groups.
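The Benjamini-Hochberg correction applied to the pairwise logrank p-values can be sketched as follows. The p-values in the usage example are made up for illustration and are not results from the study; the logrank test itself is not reimplemented here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Three hypothetical pairwise comparisons among three risk groups.
print(benjamini_hochberg([0.01, 0.04, 0.03]))  # approximately [0.03, 0.04, 0.04]
```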


