Semisupervised Learning with Report-guided Pseudo Labels for Deep Learning-based Prostate Cancer Detection Using Biparametric MRI
- PMID: 37795142
- PMCID: PMC10546362
- DOI: 10.1148/ryai.230031
Abstract
Purpose: To evaluate a novel method of semisupervised learning (SSL) guided by automated sparse information from diagnostic reports to leverage additional data for deep learning-based malignancy detection in patients with clinically significant prostate cancer.
Materials and methods: This retrospective study included 7756 prostate MRI examinations (6380 patients) performed between January 2014 and December 2020 for model development. An SSL method, report-guided SSL (RG-SSL), was developed for detection of clinically significant prostate cancer using biparametric MRI. RG-SSL, supervised learning (SL), and state-of-the-art SSL methods were trained using 100, 300, 1000, or 3050 manually annotated examinations. Performance on detection of clinically significant prostate cancer by RG-SSL, SL, and SSL was compared on 300 unseen examinations from an external center with a histopathologically confirmed reference standard. Performance was evaluated using receiver operating characteristic (ROC) and free-response ROC analysis. P values for performance differences were generated with a permutation test.
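The permutation test mentioned above can be illustrated with a minimal sketch. The abstract does not specify the test statistic or the number of permutations, so the absolute difference in mean scores, the two-sided formulation, and the names `permutation_test`, `n_permutations`, and `seed` are all assumptions made for illustration, not the authors' implementation.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_test(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_permutations: int = 10000,
    seed: int = 0,
) -> float:
    """Two-sided permutation test on the difference in mean scores.

    Assumption: each element is one performance value (for example, the AUC
    of one training run) for method A or method B.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)

    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled scores to the two methods.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1

    # Add-one correction so the estimated P value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)
```

Under this formulation, a small P value indicates that a difference at least as large as the observed one rarely arises when the method labels are exchanged at random.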
Results: At 100 manually annotated examinations, mean examination-based diagnostic area under the ROC curve (AUC) values for RG-SSL, SL, and the best SSL were 0.86 ± 0.01 (SD), 0.78 ± 0.03, and 0.81 ± 0.02, respectively. Lesion-based detection partial AUCs were 0.62 ± 0.02, 0.44 ± 0.04, and 0.48 ± 0.09, respectively. Examination-based performance of SL with 3050 examinations was matched by RG-SSL with 169 manually annotated examinations, thus requiring 14 times fewer annotations. Lesion-based performance was matched with 431 manually annotated examinations, requiring six times fewer annotations.
Conclusion: RG-SSL outperformed SSL in clinically significant prostate cancer detection and achieved performance similar to SL even at very low annotation budgets.
Keywords: Annotation Efficiency, Computer-aided Detection and Diagnosis, MRI, Prostate Cancer, Semisupervised Deep Learning
Supplemental material is available for this article. Published under a CC BY 4.0 license.
© 2023 by the Radiological Society of North America, Inc.
Conflict of interest statement
Disclosures of conflicts of interest: J.S.B. Health~Holland grant: LSHM20103; European Union H2020 grants: ProCAncer-I project (952159), PANCAIM project (101016851); Siemens Healthineers grant: CID: C00225450. A.S. Health~Holland grant: LSHM20103; European Union H2020 grants: ProCAncer-I project (952159), PANCAIM project (101016851); Siemens Healthineers grant: CID: C00225450. M.H. No relevant relationships. I.S. Health~Holland grant: LSHM20103; European Union H2020 grants: ProCAncer-I project (952159), PANCAIM project (101016851); Siemens Healthineers grant: CID: C00225450. M.d.R. No relevant relationships. H.H. Partial grant from Siemens Healthineers in combination with LSH Dutch government funding (institution).
Figures



![Quality of the pseudo labels, as evaluated by free-response receiver
operating characteristic (FROC) analysis for matching manually annotated
Prostate Imaging Reporting and Data System (PI-RADS) 4 or greater lesions in
the manually labeled development dataset. Supervised models used to generate
report-guided pseudo labels were trained with fivefold cross-validation on
the manually labeled development dataset. Uncertainty-aware mean teacher and
cross pseudo supervision models were trained with fivefold cross-validation
on the development dataset. Filtering pseudo labels using the number of
clinically significant findings described in the diagnostic report (nsig)
greatly reduced the number of false-positive lesions per examination
(report-guided pseudo labels [intermediate]). Excluding examinations with
fewer than nsig lesion candidates improved sensitivity (report-guided pseudo
labels). Shaded areas indicate 95% CIs. Error bars indicate SDs.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea9/10546362/d9bb8acc6d38/ryai.230031.fig3.gif)
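
The two filtering steps described in the figure caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `Candidate` tuple, the function name `report_guided_pseudo_labels`, and the choice to rank candidates by model confidence are hypothetical; only the use of nsig (`n_sig` here) to cap the number of kept candidates and to exclude examinations with too few candidates comes from the caption.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a lesion candidate: (confidence, lesion_id).
Candidate = Tuple[float, str]


def report_guided_pseudo_labels(
    candidates: List[Candidate], n_sig: int
) -> Optional[List[Candidate]]:
    """Filter model-predicted lesion candidates using the report count n_sig.

    Returns None when the examination should be excluded, that is, when the
    model proposes fewer candidates than the report describes.
    """
    if len(candidates) < n_sig:
        # Fewer candidates than reported findings: exclude the examination.
        return None
    # Assumption: keep the n_sig candidates with the highest confidence.
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    return ranked[:n_sig]
```

With this sketch, an examination whose report describes no clinically significant findings yields an empty pseudo label, and any surplus candidates, which would otherwise count as false positives, are discarded.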
