Sci Rep. 2017 Apr 18;7:46450.
doi: 10.1038/srep46450.

Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent

Angel Cruz-Roa et al. Sci Rep. 2017.

Abstract

With the increasing ability to routinely and rapidly digitize whole slide images with slide scanners, there has been interest in developing computerized image analysis algorithms for automated detection of disease extent from digital pathology images. The manual identification of presence and extent of breast cancer by a pathologist is critical for patient management for tumor staging and assessing treatment response. However, this process is tedious and subject to inter- and intra-reader variability. For computerized methods to be useful as decision support tools, they need to be resilient to data acquired from different sources, different staining and cutting protocols and different scanners. The objective of this study was to evaluate the accuracy and robustness of a deep learning-based method to automatically identify the extent of invasive tumor on digitized images. Here, we present a new method that employs a convolutional neural network for detecting the presence of invasive tumor on whole slide images. Our approach involves training the classifier on nearly 400 exemplars from multiple different sites and scanners, and then independently validating on almost 200 cases from The Cancer Genome Atlas. Our approach yielded a Dice coefficient of 75.86%, a positive predictive value of 71.62% and a negative predictive value of 96.77% in terms of pixel-by-pixel evaluation compared to manually annotated regions of invasive ductal carcinoma.
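The three pixel-by-pixel measures reported in the abstract can be illustrated with a minimal sketch. It assumes the prediction and the manual annotation are available as binary masks of equal shape; the function name `pixel_metrics` and the toy masks are hypothetical and are not taken from the paper, whose own evaluation code is not shown here.

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Dice, PPV and NPV from two binary masks of equal shape.

    Assumption: nonzero means invasive tumor, zero means everything else.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    tp = int(np.sum(pred & truth))    # tumor pixels predicted as tumor
    fp = int(np.sum(pred & ~truth))   # non-tumor pixels predicted as tumor
    fn = int(np.sum(~pred & truth))   # tumor pixels that were missed
    tn = int(np.sum(~pred & ~truth))  # non-tumor pixels predicted as non-tumor

    def ratio(num, den):
        # Return NaN instead of dividing by zero on degenerate masks.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Toy 2x4 example (hypothetical data, not from the study).
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
pred = [[1, 1, 1, 0],
        [0, 1, 0, 0]]
metrics = pixel_metrics(pred, truth)
print(metrics)
```

In the toy example there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all three measures come out at 0.75.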

Conflict of interest statement

Drs Madabhushi, Feldman, Ganesan, and Tomaszewski are scientific consultants for the digital pathology company Inspirata Inc. Drs Madabhushi, Feldman, Ganesan, and Tomaszewski also serve on the scientific advisory board for the digital pathology company Inspirata Inc. Dr. Madabhushi also has an equity stake in Inspirata Inc. and Elucid Bioimaging Inc.

Figures

Figure 1
(A–C) Example whole-slide images from the test TCGA data cohort with ground truth annotations from pathologists, (D–F) the corresponding region predictions produced by the ConvNet classifier and (G–I) region predictions for whole-slide images from the test NC data cohort of normal breast tissue without cancer.
Figure 2. Example results for the ConvNetHUP classifier on the CINJ validation data cohort.
The probability map predicted by the ConvNetHUP classifier (second row, (D–F)) was compared against ground truth annotations by a pathologist (first row, (A–C)). The third row shows the evaluation results of the ConvNetHUP classifier in terms of TP (green), FN (red), FP (yellow), and TN (blue) regions.
Figure 3. Whole-slide image from CINJ validation data cohort diagnosed with a rare type of IDC: mucinous carcinoma of the breast.
(A) The comparison between the ground truth annotations and the predictions from the ConvNetHUP classifier reveals both FN (red) and FP (yellow) errors. (B,C) Most of the FN regions, i.e. tissues wrongly labeled as non-invasive tumor, correspond to mucinous carcinoma, whilst (D) most of the FP regions, i.e. tissues wrongly predicted as invasive tumor, are actually invasive mucinous carcinoma that was not included in the annotations by the pathologist.
Figure 4
The ConvNetHUP classifier achieved its poorest performance on the most challenging whole-slide image in the CINJ validation cohort, with (A) many FP regions and a Dice coefficient of 0.0745. (B) Some of the FN errors are due to the confounding morphologic attributes of the tumor, arising from a mixing of IDC with fat cells and irregular, infiltrating-looking cribriform glands with DCIS. The FP regions appear to be primarily due to (C) sclerosing adenosis, and (D) DCIS surrounded by IDC.
Figure 5. Agreement plot of the Dice coefficient for the ConvNetHUP (X-axis) and ConvNetUHCMC/CWRU (Y-axis) classifiers for each slide (blue circles) in the TCGA cohort.
The slides with higher disagreement are identified with red circles (see Fig. 6).
Figure 6
(A–C) Slides from the TCGA cohort which revealed disagreement between the predictions of the ConvNetHUP and ConvNetUHCMC/CWRU classifiers. The predictions of the (D–F) ConvNetHUP and (G–I) ConvNetUHCMC/CWRU classifiers were compared against the ground truth annotations in terms of TP (green), FN (red), FP (yellow) and TN (blue) regions.
Figure 7
(A–C) Example whole-slide images from the TCGA data cohort with corresponding ground truth annotations. The probability maps generated by the ConvNetUHCMC/CWRU and ConvNetHUP classifiers are shown in panels (D–F) and (G–I), respectively.
Figure 8
The probability maps obtained via the ConvNetUHCMC/CWRU and ConvNetHUP classifiers on whole-slide images of normal breast sections from the UHCMC/CWRU and NC data cohorts are shown in panels (A–C) and (D–F), respectively.
Figure 9. Dice coefficient between pathologist annotations for the CINJ data cohort (N = 40).
Figure 10. Overview of the process of training and testing of the deep learning classifiers for invasive breast cancer detection on whole-slide images.
The training data set had 349 ER+ invasive breast cancer patients (HUP N = 239, UHCMC/CWRU N = 110). The validation data set contained 40 ER+ invasive breast cancer patients from the Cancer Institute of New Jersey (CINJ). The test data set was composed of 195 ER+ invasive breast cancer cases from TCGA and 21 negative controls (NC).
Figure 11. 3-layer ConvNet architecture.
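
The TP/FN/FP/TN colour coding used in the evaluation panels of Figures 2 and 6 could be reproduced along the following lines. This is only a sketch: the 0.5 threshold applied to the probability map, the exact RGB values and the function name `evaluation_map` are assumptions for illustration and are not stated in the source.

```python
import numpy as np

# Colour scheme named in the captions: TP green, FN red, FP yellow, TN blue.
# The specific RGB triplets are an assumption.
COLORS = {
    "TP": (0, 255, 0),
    "FN": (255, 0, 0),
    "FP": (255, 255, 0),
    "TN": (0, 0, 255),
}

def evaluation_map(prob_map, truth, threshold=0.5):
    """Return an RGB image marking each pixel as TP, FN, FP or TN.

    Assumption: a pixel is predicted as invasive tumor when its
    probability is at least `threshold` (0.5 here, chosen arbitrarily).
    """
    prob = np.asarray(prob_map, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    if prob.shape != truth.shape:
        raise ValueError("probability map and annotation must match in shape")

    pred = prob >= threshold
    rgb = np.zeros(truth.shape + (3,), dtype=np.uint8)
    rgb[pred & truth] = COLORS["TP"]
    rgb[~pred & truth] = COLORS["FN"]
    rgb[pred & ~truth] = COLORS["FP"]
    rgb[~pred & ~truth] = COLORS["TN"]
    return rgb

# Toy 2x2 example (hypothetical values): one pixel of each kind.
prob = [[0.9, 0.2],
        [0.7, 0.1]]
truth = [[1, 1],
         [0, 0]]
rgb = evaluation_map(prob, truth)
print(rgb.shape)
```

Every pixel falls into exactly one of the four masks, so the output image is fully coloured and can be overlaid on a downsampled slide for visual inspection.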
