Med Phys. 2022 Aug;49(8):5160-5181.
doi: 10.1002/mp.15777. Epub 2022 Jun 13.

Bridging the gap between prostate radiology and pathology through machine learning

Indrani Bhattacharya et al. Med Phys. 2022 Aug.

Abstract

Background: Prostate cancer remains the second-leading cause of cancer death among American men despite clinical advancements. Currently, magnetic resonance imaging (MRI) is considered the most sensitive non-invasive imaging modality that enables visualization, detection, and localization of prostate cancer, and is increasingly used to guide targeted biopsies for prostate cancer diagnosis. However, its utility remains limited due to high rates of false positives and false negatives as well as low inter-reader agreement.

Purpose: Machine learning methods to detect and localize cancer on prostate MRI can help standardize radiologist interpretations. However, existing machine learning methods vary not only in model architecture, but also in the ground truth labeling strategies used for model training. We compare different labeling strategies and the effects they have on the performance of different machine learning models for prostate cancer detection on MRI.

Methods: Four different deep learning models (SPCNet, U-Net, branched U-Net, and DeepLabv3+) were trained to detect prostate cancer on MRI using 75 patients with radical prostatectomy, and evaluated using 40 patients with radical prostatectomy and 275 patients with targeted biopsy. Each deep learning model was trained with four different label types: pathology-confirmed radiologist labels, pathologist labels on whole-mount histopathology images, and lesion-level and pixel-level digital pathologist labels (generated by a previously validated deep learning algorithm that predicts pixel-level Gleason patterns on histopathology images) on whole-mount histopathology images. The pathologist and digital pathologist labels (collectively referred to as pathology labels) were mapped onto pre-operative MRI using an automated MRI-histopathology registration platform.

Results: Radiologist labels missed cancers (ROC-AUC: 0.75-0.84), had lower lesion volumes (~68% of pathology lesions), and lower Dice overlaps (0.24-0.28) when compared with pathology labels. Consequently, machine learning models trained with radiologist labels also showed inferior performance compared to models trained with pathology labels. Digital pathologist labels showed high concordance with pathologist labels of cancer (lesion ROC-AUC: 0.97-1, lesion Dice: 0.75-0.93). Machine learning models trained with digital pathologist labels had the highest lesion detection rates in the radical prostatectomy cohort (aggressive lesion ROC-AUC: 0.91-0.94), and had generalizable and comparable performance to pathologist label-trained models in the targeted biopsy cohort (aggressive lesion ROC-AUC: 0.87-0.88), irrespective of the deep learning architecture. Moreover, machine learning models trained with pixel-level digital pathologist labels were able to selectively identify aggressive and indolent cancer components in mixed lesions on MRI, which is not possible with any human-annotated label type.
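The Dice overlaps reported above follow the standard definition, 2|A∩B| / (|A| + |B|), between two binary lesion masks. A minimal illustrative sketch (not the authors' code) is:

```python
import numpy as np

def dice_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks are in perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy example: two partially overlapping square "lesions" on an 8x8 grid
a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True  # 16 voxels
b = np.zeros((8, 8), dtype=bool); b[4:8, 4:8] = True  # 16 voxels
print(dice_overlap(a, b))  # 2*4 / (16 + 16) = 0.25
```

A Dice of 0.24-0.28, as observed between radiologist and pathology labels, thus indicates that only a modest fraction of the labeled volumes agree.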

Conclusions: Machine learning models for prostate MRI interpretation that are trained with digital pathologist labels showed performance higher than or comparable to pathologist label-trained models in both the radical prostatectomy and targeted biopsy cohorts. Digital pathologist labels can reduce challenges associated with human annotations, including labor, time, and inter- and intra-reader variability, and can help bridge the gap between prostate radiology and pathology by enabling the training of reliable machine learning models to detect and localize prostate cancer on MRI.

Keywords: aggressive versus indolent cancer; cancer labels; deep learning; digital pathology; prostate MRI.


Conflict of interest statement

Mirabela Rusu has research grants from GE Healthcare and Philips Healthcare.

Figures

FIGURE 1
Labels created by radiologists, pathologists, or digital pathologists on MRI serve to train deep learning models to detect cancer and aggressive cancer on MRI. The pathology labels (LPath, LLesionDPath, and LPixelDPath) are derived from annotations on whole‐mount histopathology images and are mapped onto MRI through MRI‐histopathology registration. The pixel‐level digital pathologist label (LPixelDPath) enables identification of aggressive and indolent cancer components in mixed lesions, unlike the other label types
FIGURE 2
Differences in labeling strategies in a typical patient in cohort C1 test (aggressive cancer—yellow, indolent cancer—green) shown on (a) T2w images and (b) ADC images. The (c) radiologist labels (LRad) and (d) pathologist labels (LPath) are present on some slices, while the (e) lesion‐level digital pathologist labels (LLesionDPath) and (f) pixel‐level digital pathologist labels (LPixelDPath) exist on all slices. Digital pathologist labels strongly agree with pathologist labels in annotating aggressive and indolent cancer components in mixed lesions
FIGURE 3
Distribution of lesion volumes of discarded lesions for (a) radiologist (LRad), (b) pathologist (LPath), (c) lesion‐level digital pathologist (LLesionDPath), and (d) pixel‐level digital pathologist (LPixelDPath) labels for cohort C1 test. The red vertical line indicates the threshold lesion volume of 250 mm3. Only three radiologist lesions in C1 test were discarded, whereas a large number of pathology lesions with predominantly tiny volumes (median discarded lesion volume 0.4–1.3 mm3) were discarded. The y‐axis shows the frequency distribution on a log scale, while the x‐axis shows the lesion volume in mm3
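The 250 mm3 threshold amounts to dropping small connected components from a binary label mask. A hypothetical sketch using scipy.ndimage is below; the function and parameter names are assumptions for illustration, not the authors' implementation:

```python
import numpy as np
from scipy import ndimage

def discard_small_lesions(label_mask: np.ndarray,
                          voxel_volume_mm3: float,
                          min_volume_mm3: float = 250.0) -> np.ndarray:
    """Remove connected components (lesions) smaller than min_volume_mm3."""
    components, n_lesions = ndimage.label(label_mask)  # connected-component labeling
    keep = np.zeros_like(label_mask, dtype=bool)
    for i in range(1, n_lesions + 1):
        lesion = components == i
        if lesion.sum() * voxel_volume_mm3 >= min_volume_mm3:
            keep |= lesion  # retain lesions at or above the volume threshold
    return keep
```

For example, with 10 mm3 voxels, a 30-voxel lesion (300 mm3) survives the filter while a 4-voxel lesion (40 mm3) is discarded.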
FIGURE 4
The difference in resolution between the whole‐mount histopathology and the MR images, and the detailed gland‐level annotations of pathology labels, often result in tiny lesions that are (a) only a few pixels on MRI and clinically insignificant (shown by yellow arrows). Discarding small lesions with volumes <250 mm3 results in (b) cleaner and clinically meaningful lesions for training and evaluation of digital radiologist models. Zooming into these tiny lesions (red box in (a)) on (c) high-resolution histopathology and (d) the registered MRI further reveals that they are not clinically meaningful for detection on MRI. While tiny, the lesion shown by the white arrow is not discarded, as it connects to the lesion visible in the subsequent MRI slices
FIGURE 5
All the digital radiologist models (SPCNet, U‐Net, branched U‐Net, and DeepLabv3+) are trained with T2w and ADC images of the prostate as inputs. Each model is trained with one of the four label types as ground truth at a time. The DeepLabv3+ model is trained in a 2D fashion, with a single slice of T2w and ADC image as input (as shown in this figure), while the other models are trained in a 2.5D fashion with three consecutive MRI slices as inputs. Pre‐processing of the T2w and ADC images includes registration, cropping, and resampling around the prostate, and MRI intensity standardization and normalization
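The MRI intensity standardization and normalization step mentioned in the caption can be illustrated with a simple z-score normalization over a region of interest. This is a sketch under assumptions; the paper's actual pre-processing pipeline may differ:

```python
import numpy as np

def normalize_intensities(image: np.ndarray, roi_mask: np.ndarray) -> np.ndarray:
    """Z-score normalize image intensities using statistics from a region of interest
    (e.g., the prostate), so inputs to the network share a common intensity scale."""
    roi = image[roi_mask.astype(bool)]
    mu, sigma = roi.mean(), roi.std()
    return (image - mu) / (sigma + 1e-8)  # epsilon guards against a constant ROI
```

Normalizing within the prostate rather than the whole image keeps the statistics from being dominated by background voxels.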
FIGURE 6
Quantitative comparison between cancer outlines of the different label types. (a) Dice overlap for cancer, (b) lesion‐level ROC‐AUC for cancer, (c) Dice overlap for aggressive cancer, (d) lesion‐level ROC‐AUC for aggressive cancer
FIGURE 7
The digital pathologist‐predicted automated aggressive (Gleason pattern 4, green) and indolent (Gleason pattern 3, blue) cancers visually match the manual cancer annotations by the expert pathologist (black, yellow, orange, and red). (a) Whole‐mount histopathology image with (b–d) close-ups of the two cancer lesions. (c) Cancer labels manually outlined by the expert pathologist (black outline) show high agreement with overall cancer (combined blue and green) predicted by the digital pathologist model. (b, d) It is impractically time-consuming for a human pathologist to manually assign pixel‐level Gleason patterns (yellow, orange, and red) to each gland in detail as done by the digital pathologist (blue and green)
FIGURE 8
Predictions from SPCNet trained with different label types for a typical patient from cohort C1 test (same as Figure 2) show that only the LPixelDPath‐trained SPCNet (f) selectively identified the aggressive and indolent cancer components in the lesion, while all other models detected the entire lesion as aggressive (SPCNet predictions: aggressive cancer [red], indolent cancer [blue]). (a) T2w images, (b) ADC images, (c) LRad‐trained SPCNet predictions, (d) LPath‐trained SPCNet predictions, (e) LLesionDPath‐trained SPCNet predictions, (f) LPixelDPath‐trained SPCNet predictions
FIGURE 9
Labels and SPCNet predictions for three different patients from cohort C1 test (labels: aggressive cancer [yellow], indolent cancer [green]; SPCNet predictions: aggressive cancer [red], indolent cancer [blue]) on (a) T2w and (b) ADC images. The (c) LRad labels and LRad‐trained SPCNet predictions may miss cancers or underestimate cancer extent. The (d) LPath labels and LPath‐trained SPCNet predictions, and the (e) LLesionDPath labels and LLesionDPath‐trained SPCNet predictions, show strong agreement in cancer localization and extent. The (f) LPixelDPath labels and LPixelDPath‐trained SPCNet predictions can selectively identify and localize the aggressive and indolent cancer components in mixed lesions, unlike any other label or prediction type. The outlines for columns with SPCNet predictions correspond to pathologist annotations. Radiologists and pathologists are not required to annotate cancer extent on all slices of a patient for routine clinical care, but knowing the complete extent of cancer on all slices may be essential to train machine learning models. As such, C1‐Pat3 does not show an LPath label even though cancer is present
FIGURE 10
SPCNet predictions for two different patients from cohort C2 on (a) T2w and (b) ADC images. The (c) LRad‐trained SPCNet predictions miss the cancer in the row 2 patient (C2‐Pat2). The (d) LPath‐trained and (e) LLesionDPath‐trained SPCNet predictions detect the lesions in both patients, with the (e) LLesionDPath‐trained predictions having the highest overlap with the cancer extent. The (f) LPixelDPath‐trained SPCNet predictions are slightly off from the LRad labels for the row 2 patient (C2‐Pat2). The outlines for columns with SPCNet predictions correspond to radiologist labels (LRad)
FIGURE 11
Quantitative comparison between digital radiologist (SPCNet) predictions when trained and evaluated using different label types in cohort C1 test. The top row shows results for cancer detection, while the bottom row shows results for aggressive cancer detection. Darker blue boxes in the 4 × 4 matrices represent higher evaluation metrics.
