Review. Life (Basel). 2022 Sep 26;12(10):1490.
doi: 10.3390/life12101490.

Comparative Performance of Deep Learning and Radiologists for the Diagnosis and Localization of Clinically Significant Prostate Cancer at MRI: A Systematic Review

Christian Roest et al. Life (Basel).

Abstract

Background: Deep learning (DL)-based models have demonstrated an ability to automatically diagnose clinically significant prostate cancer (PCa) on MRI scans and are regularly reported to approach expert performance. The aim of this work was to systematically review the literature comparing DL systems with radiologists in order to evaluate the relative performance of current state-of-the-art DL models and radiologists.

Methods: This systematic review was conducted in accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Studies investigating DL models for diagnosing clinically significant (cs) PCa on MRI were included. The quality and risk of bias of each study were assessed using the checklist for AI in medical imaging (CLAIM) and QUADAS-2, respectively. Patient-level and lesion-level diagnostic performance were evaluated separately: at the patient level, the sensitivity achieved by DL and radiologists was compared at an identical specificity; at the lesion level, it was compared at an identical number of false positives per patient.
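The patient-level comparison described in the Methods (sensitivity compared at an identical specificity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the reviewed studies' or the authors' code: the function names, the threshold sweep, and the eight-patient toy data are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def operating_point(y_true, y_pred):
    """Return (sensitivity, specificity) for binary predictions.

    Assumes both classes are present in y_true (no zero-division guard).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)
    fn = np.sum(~y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    return float(tp / (tp + fn)), float(tn / (tn + fp))

def sensitivity_at_specificity(y_true, scores, target_specificity):
    """Highest sensitivity over all thresholds whose specificity
    is at least target_specificity (a patient is called positive
    when score >= threshold). Returns 0.0 if no threshold qualifies."""
    scores = np.asarray(scores, dtype=float)
    best = 0.0
    for threshold in np.unique(scores):
        sens, spec = operating_point(y_true, scores >= threshold)
        if spec >= target_specificity:
            best = max(best, sens)
    return best

# Hypothetical toy data: 1 = csPCa on the reference standard.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
pirads = np.array([1, 2, 3, 4, 3, 4, 5, 5])          # radiologist scores
dl_scores = np.array([0.1, 0.2, 0.6, 0.3, 0.4, 0.7, 0.8, 0.9])

# Radiologist operating point at the PI-RADS >= 4 cutoff.
rad_sens, rad_spec = operating_point(y_true, pirads >= 4)

# DL sensitivity read off at the radiologist's specificity.
dl_sens = sensitivity_at_specificity(y_true, dl_scores, rad_spec)

print(f"radiologist: sens={rad_sens:.2f}, spec={rad_spec:.2f}")
print(f"DL at matched specificity: sens={dl_sens:.2f}")
```

Matching on specificity fixes one axis of the ROC plane, so the two readers can be compared on a single number (sensitivity) rather than on two operating points that trade off differently.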

Results: The final selection consisted of eight studies with a combined 7337 patients. The median study quality with CLAIM was 74.1% (IQR: 70.6-77.6). DL achieved an identical patient-level performance to the radiologists for PI-RADS ≥ 3 (both 97.7%, SD = 2.1%). DL had a lower sensitivity for PI-RADS ≥ 4 (84.2% vs. 88.8%, p = 0.43). The sensitivity of DL for lesion localization was also between 2% and 12.5% lower than that of the radiologists.

Conclusions: DL models for the diagnosis of csPCa on MRI appear to approach the performance of experts but currently have a lower sensitivity compared to experienced radiologists. There is a need for studies with larger datasets and for validation on external data.

Keywords: deep learning; magnetic resonance imaging; prostatic neoplasms.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure A1
Results of the bias assessment using the QUADAS-2 tool [21,22,23,24,25,26,27,28].
Figure 1
PRISMA 2020 Flow diagram.
Figure 2
An overview of the quality of the included studies evaluated using the CLAIM checklist. The scores reflect the percentage of applicable checklist items that were sufficiently reported by each of the studies. All scores were agreed upon by two reviewers [21,22,23,24,25,26,27,28].
Figure 3
Average accordance with the CLAIM checklist by checklist item. Checklist items that were considered not applicable to the study were omitted from the calculation.
Figure 4
Sensitivity and specificity of DL systems for the diagnosis of csPCa at the patient level, compared to the respective radiologist benchmarks [22,23,24,27,28]. (*) The radiologist’s performance in Hiremath et al. [22] was estimated from ROC curves at specificity estimates derived from the literature [4].
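
The CLAIM scoring described in the captions of Figures 2 and 3 (the percentage of applicable checklist items that were sufficiently reported, with non-applicable items omitted) can be sketched as below. This is an illustrative sketch only: the encoding of each item as `True`, `False`, or `None` (not applicable), the function names, and the example ratings are assumptions, not data from the review.

```python
from typing import List, Optional, Sequence

Rating = Optional[bool]  # True = sufficiently reported, False = not, None = not applicable

def claim_score(items: Sequence[Rating]) -> float:
    """Per-study score: percentage of applicable items sufficiently reported."""
    applicable = [item for item in items if item is not None]
    if not applicable:
        raise ValueError("no applicable checklist items")
    return 100.0 * sum(applicable) / len(applicable)

def item_accordance(studies: Sequence[Sequence[Rating]]) -> List[Optional[float]]:
    """Per-item accordance across studies, skipping studies where the item
    is not applicable. An item applicable to no study yields None."""
    result: List[Optional[float]] = []
    for ratings in zip(*studies):
        applicable = [r for r in ratings if r is not None]
        result.append(100.0 * sum(applicable) / len(applicable) if applicable else None)
    return result

# Hypothetical ratings for two studies on a three-item checklist.
studies = [
    [True, None, False],
    [True, True, None],
]

print([claim_score(s) for s in studies])
print(item_accordance(studies))
```

Excluding non-applicable items from the denominator keeps a study from being penalized for checklist items that do not apply to its design.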

References

    1. Sung H., Ferlay J., Siegel R.L., Laversanne M., Soerjomataram I., Jemal A., Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA A Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
    2. Egevad L., Granfors T., Karlberg L., Bergh A., Stattin P. Prognostic Value of the Gleason Score in Prostate Cancer. BJU Int. 2002;89:538–542. doi: 10.1046/j.1464-410X.2002.02669.x.
    3. Weinreb C.J., Barentsz J.O., Choyke P.L., Cornud F., Haider M.A., Macura K.J., Margolis D., Shtern F., Tempany C.M., Thoeny H.C., et al. PI-RADS Prostate Imaging—Reporting and Data System: 2015, Version 2. Eur. Urol. 2016;69:16–40. doi: 10.1016/j.eururo.2015.08.052.
    4. Daun M., Fardin S., Ushinsky A., Batra S., Nguyentat M., Lee T., Uchio E., Lall C., Houshyar R. PI-RADS Version 2 Is an Excellent Screening Tool for Clinically Significant Prostate Cancer as Designated by the Validated International Society of Urological Pathology Criteria: A Retrospective Analysis. Curr. Probl. Diagn. Radiol. 2020;49:407–411. doi: 10.1067/j.cpradiol.2019.06.010.
    5. Smith C.P., Harmon S.A., Barrett T., Bittencourt L.K., Law Y.M., Shebel H., An J.Y., Czarniecki M., Mehralivand S., Coskun M., et al. Intra- and Interreader Reproducibility of PI-RADSv2: A Multireader Study. J. Magn. Reson. Imaging. 2019;49:1694–1703. doi: 10.1002/jmri.26555.