NPJ Digit Med. 2021 Jun 14;4(1):99. doi: 10.1038/s41746-021-00469-6.

Yet Another Automated Gleason Grading System (YAAGGS) by weakly supervised deep learning



Yechan Mun et al. NPJ Digit Med. 2021.

Abstract

The Gleason score contributes significantly to predicting prostate cancer outcomes and selecting the appropriate treatment option, but it is affected by well-known inter-observer variation. We present a novel deep learning-based automated Gleason grading system that does not require extensive region-level manual annotations by experts and/or complex algorithms for the automatic generation of region-level annotations. A total of 6664 and 936 prostate needle biopsy single-core slides (689 and 99 cases) from two institutions were used for system discovery and validation, respectively. Pathological diagnoses were converted into grade groups and used as the reference standard. The grade group prediction accuracy of the system was 77.5% (95% confidence interval (CI): 72.3-82.7%), the Cohen's kappa score (κ) was 0.650 (95% CI: 0.570-0.730), and the quadratic-weighted kappa score (κquad) was 0.897 (95% CI: 0.815-0.979). When trained on 621 cases from one institution and validated on 167 cases from the other institution, the system's accuracy reached 67.4% (95% CI: 63.2-71.6%), κ 0.553 (95% CI: 0.495-0.610), and κquad 0.880 (95% CI: 0.822-0.938). To evaluate the impact of the proposed method, its performance was also compared with several baseline methods. While limited by case volume and a few other factors, the results of this study can contribute to the potential development of artificial intelligence systems that diagnose other cancers without extensive region-level annotations.
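As a reading aid, the slide-level metrics quoted above (accuracy, Cohen's kappa, and quadratic-weighted kappa) can be recomputed for any set of predictions with standard tooling. The sketch below is illustrative only: the arrays are placeholders rather than the authors' data, grade groups are assumed to be encoded as integers 0 (benign) through 5 (grade group 5), and the reported 95% confidence intervals would additionally require a resampling procedure such as bootstrapping.

```python
# Illustrative only: recomputes the reported slide-level metrics on toy data.
# Grade groups are assumed to be encoded as integers 0 (benign) .. 5 (grade group 5).
from sklearn.metrics import accuracy_score, cohen_kappa_score

y_true = [0, 1, 1, 2, 3, 4, 5, 5]   # reference-standard grade groups (placeholder)
y_pred = [0, 1, 2, 2, 3, 4, 4, 5]   # model predictions (placeholder)

acc = accuracy_score(y_true, y_pred)                                  # accuracy
kappa = cohen_kappa_score(y_true, y_pred)                             # Cohen's kappa
kappa_quad = cohen_kappa_score(y_true, y_pred, weights="quadratic")   # quadratic-weighted kappa

print(f"accuracy={acc:.3f}  kappa={kappa:.3f}  quadratic kappa={kappa_quad:.3f}")
```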


Conflict of interest statement

H.C., I.P. and Y.M. are employees of Deep Bio Inc. T.-Y.K. is the chief technology officer of Deep Bio Inc. S.-J.S. declares no conflict of interest.

Figures

Fig. 1. Slide-level confusion matrices between the proposed model and the reference standard in grade group prediction.
Holistic setting: a normalized, b original; inter-institutional setting: c normalized, d original; external validation setting: e normalized, f original.
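The "normalized" and "original" panels differ only in whether each row of the confusion matrix is divided by the number of reference-standard slides in that grade group. A minimal sketch of the two variants, using the same placeholder encoding as above (not the authors' data or plotting code):

```python
# Illustrative only: "original" (raw count) vs "normalized" (row-normalized)
# confusion matrices, with placeholder labels 0 (benign) .. 5 (grade group 5).
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 1, 2, 3, 4, 5, 5]   # reference standard (placeholder)
y_pred = [0, 1, 2, 2, 3, 4, 4, 5]   # model predictions (placeholder)
labels = [0, 1, 2, 3, 4, 5]

cm_original = confusion_matrix(y_true, y_pred, labels=labels)
cm_normalized = confusion_matrix(y_true, y_pred, labels=labels, normalize="true")  # rows sum to 1

print(cm_original)
print(cm_normalized)
```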
Fig. 2. Sample patch images from failure cases (Hematoxylin-eosin stain, ×200).
False-negative cases showed small cancers, consisting of only a few cancer glands, or cancer glands located on the outer sample margin of the WSI; a was diagnosed as grade group 1 and b as grade group 4 in the reference standard. False-positive cases often exhibited diffuse infiltration of lymphocytes and atrophic glands; the model predicted grade group 4 for c and grade group 1 for d.
Fig. 3. t-SNE visualization of the feature vectors of Gleason pattern 3/4/5 image patches embedded by the first-stage model (perplexity 50, 1000 iterations).
Larger dots correspond to the mean feature vectors.
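A minimal sketch of how such an embedding can be produced with scikit-learn, assuming the first-stage 1024-dimensional patch features are available as a NumPy array; the feature array below is a random placeholder, the perplexity follows the caption, and scikit-learn's default of 1000 optimization iterations matches the caption as well.

```python
# Illustrative only: 2-D t-SNE embedding of patch feature vectors.
# The features here are random placeholders for the 1024-d first-stage outputs.
import numpy as np
from sklearn.manifold import TSNE

features = np.random.rand(3000, 1024).astype(np.float32)   # placeholder: (n_patches, 1024)

# perplexity=50 follows the caption; the default iteration count (1000) also matches.
tsne = TSNE(n_components=2, perplexity=50, init="pca", random_state=0)
embedding = tsne.fit_transform(features)                    # (n_patches, 2) coordinates for plotting
```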
Fig. 4. Model training process of YAAGGS.
The model classifies input WSIs into Gleason grade groups in two stages. In the first stage (feature extraction), patch images of size 360 × 360 pixels covering the entire slide area are extracted from the input WSI at 10× magnification and fed into the first-stage CNN model to produce 1024-dimensional feature vectors. The extracted feature vectors are aligned according to the locations of the corresponding patch images and assembled into a 1024-channel two-dimensional feature map. The second-stage CNN model accepts the feature map as input and classifies it into one of six categories: benign, grade group 1, grade group 2, grade group 3, grade group 4, and grade group 5.
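The caption above determines the data flow but not the network internals. The following PyTorch sketch illustrates that flow under stated assumptions: the backbone layers, the grid-assembly helper, and all names are hypothetical stand-ins, not the authors' architecture; only the 360 × 360 patch size, the 1024-dimensional features, the 1024-channel feature map, and the six output classes come from the caption.

```python
# Illustrative PyTorch sketch of the two-stage data flow described above.
# Layer choices, helper names, and the toy grid are assumptions, not the
# authors' architecture.
import torch
import torch.nn as nn

class PatchFeatureExtractor(nn.Module):
    """Stage 1: 360x360 patch image -> 1024-d feature vector (placeholder backbone)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1024, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, patches):                   # (N, 3, 360, 360)
        return self.backbone(patches).flatten(1)  # (N, 1024)

class SlideClassifier(nn.Module):
    """Stage 2: 1024-channel feature map -> logits for benign + grade groups 1-5."""
    def __init__(self, num_classes=6):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(1024, 256, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(256, num_classes),
        )

    def forward(self, feature_map):               # (1, 1024, H, W)
        return self.head(feature_map)             # (1, 6)

def assemble_feature_map(features, positions, grid_h, grid_w):
    """Place each patch's feature vector at its (row, col) grid position on the slide."""
    fmap = torch.zeros(1, features.shape[1], grid_h, grid_w)
    for feat, (row, col) in zip(features, positions):
        fmap[0, :, row, col] = feat
    return fmap

# Toy usage: 4 patches arranged on a 2x2 grid of slide locations.
with torch.no_grad():
    patches = torch.rand(4, 3, 360, 360)
    positions = [(0, 0), (0, 1), (1, 0), (1, 1)]
    features = PatchFeatureExtractor()(patches)                   # (4, 1024)
    feature_map = assemble_feature_map(features, positions, 2, 2)
    logits = SlideClassifier()(feature_map)                       # (1, 6) slide-level logits
```

Because only the slide-level grade group supervises the second stage, no region-level annotations are needed, which is consistent with the weakly supervised setup described in the abstract.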
