Multicenter Study
Radiol Imaging Cancer. 2025 Sep;7(5):e240507.
doi: 10.1148/rycan.240507.

Improving Clinically Significant Prostate Cancer Detection with a Multimodal Machine Learning Approach: A Large-Scale Multicenter Study

Ana Carolina Rodrigues et al. Radiol Imaging Cancer. 2025 Sep.

Abstract

Purpose To develop and prospectively validate a clinical and radiologic model to predict clinically significant prostate cancer (csPCa) using biparametric MRI (bpMRI).

Materials and Methods Retrospective data (acquired before March 31, 2022) from 12 medical centers were collected. Radiomic features were extracted from the whole prostate gland using segmentations generated by an automatic deep learning algorithm. A model incorporating bpMRI radiomics, age, prostate-specific antigen level, Prostate Imaging Reporting and Data System (PI-RADS) score, and the prostate zone of the lesion was trained. A retrospective validation set and prospective data (acquired after March 31, 2022) were used to compare the model with PI-RADS scoring (area under the receiver operating characteristic curve [AUC] and specificity at PI-RADS >3). Sensitivity analyses for sequence (T2-weighted, apparent diffusion coefficient, diffusion-weighted imaging) and scanner vendor (GE, Philips, Siemens) were performed, in addition to fairness analyses for relevant categories.

Results The retrospective dataset for model development included 7157 male patients (mean age, 64.78 years; 3342 [46.7%] with csPCa), and the prospective dataset for model validation included 1629 patients (mean age, 66.19 years; 592 [36.3%] with csPCa). The multimodal model outperformed PI-RADS in the retrospective (AUC, 0.88 vs 0.80, P = .005; specificity, 71% vs 58%, P = .002) and prospective validation sets (AUC, 0.91 vs 0.85, P < .001; specificity, 77% vs 66%, P < .001), leading to 22.7% fewer biopsies compared with PI-RADS. Sensitivity analyses showed the importance of multiple sequences and vendors in achieving model generalization, as using specific sequences or vendors alone led to worse performance. Fairness analysis showed generalizability across different categories but highlighted increased sensitivity with higher PI-RADS and reduced performance in one medical center.

Conclusion A multimodal model provided a temporally generalizable predictor of csPCa that outperformed PI-RADS. Supplemental material is available for this article. © RSNA, 2025.

Keywords: Algorithm Development; Comparative Studies; Genital/Reproductive; Machine Learning; Model Training; Model Validation; Neoplasms-Primary; Oncology; Technology Assessment.

Conflict of interest statement

Disclosures of conflicts of interest: A.C.R. No relevant relationships. J.G.d.A. No relevant relationships. N.R. Grant from SciPROJ. M.R. No relevant relationships. A.S.C.V. No relevant relationships. A.M.G. No relevant relationships. C.B. No relevant relationships. I.S. No relevant relationships. J.I. No relevant relationships. S.B. No relevant relationships. S.S. No relevant relationships. I.D. No relevant relationships. M.T. No relevant relationships. K.M. No relevant relationships. D.R. Grants or contracts from AIRC 5x1000 Colon Cancer. N.P. Stock in MRIcons.

Figures

Figure 1:
(A) Flowchart of the inclusion and exclusion criteria of retrospective and prospective biparametric MRI (bpMRI) examinations in ProstateNet. Exclusion criteria were a missing bpMRI sequence (T2W [T2-weighted images], DWI [diffusion-weighted imaging], or ADC [apparent diffusion coefficient maps]), file corruption, coregistration failure, and missing clinical or target information. (B) Schematic representation of our modeling and validation protocol. After studies are retrieved from ProstateNet, prostate segmentation masks are automatically generated from T2-weighted imaging and coregistered to DWI and ADC. Radiomic features are extracted from all series. These, together with clinical variables (age, prostate-specific antigen [PSA], Prostate Imaging Reporting and Data System [PI-RADS], lesion location), are used to train a radiomics model predicting either clinically significant prostate cancer (csPCa) or no csPCa. Validation was performed against PI-RADS scoring using two datasets: a retrospective validation set and a prospective cohort including 194 external studies (also available in ProstateNet).
Figure 2:
(A) Receiver operating characteristic curve of our model compared with Prostate Imaging Reporting and Data System (PI-RADS) on both the retrospective validation set (blue) and prospective validation set (red). The dashed line corresponds to the sensitivity of PI-RADS >3. (B) Confusion matrices for the retrospective and prospective cohorts. (C) Comparison of PI-RADS and the clinical plus radiomics models in terms of area under the receiver operating characteristic curve (AUC), sensitivity at the specificity of PI-RADS >3, and specificity at the sensitivity of PI-RADS >3, stratified according to retrospective or prospective cohort. (D) Shapley additive explanation (SHAP) values for the contribution of features to model predictions on the prospective cohort. Each dot in the graph represents the SHAP value of a feature (clinical features) or of a group of features (radiomic features) for one observation. Positive and negative SHAP values correspond to contributions to classification as clinically significant prostate cancer (csPCa) or non-csPCa, respectively. (E) Learning curve for cross-validation (CV), retrospective, and prospective model performance. The shaded green area represents the SD of the CV performance estimate. *P < .05. **P < .01. ***P < .001. AS = anterior stroma, CZ = central zone, PZ = peripheral zone, TZ = transition zone.
Figure 3:
Cross-validation (CV) and retrospective validation set sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) for models trained to predict clinical significance (ISUP 2–5), stratified according to the sequence type of the radiomics used during training, vendor, and whether clinical features were used to train the model. ADC = apparent diffusion coefficient, bpMRI = biparametric MRI (combined T2W, ADC, and DWI), DWI = diffusion-weighted imaging, ERC = endorectal coil, ISUP = International Society of Urological Pathology, T2W = T2-weighted.
Figure 4:
Performance metrics for our model applied to the prospective cohort, stratified according to relevant subgroups (age, endorectal coil [ERC], country, scanner vendor, Prostate Imaging Reporting and Data System [PI-RADS]). The gold vertical line in the first three columns corresponds to the performance observed for the whole prospective dataset. Horizontal black lines in the first three columns correspond to the parametric 95% CIs (DeLong test for area under the receiver operating characteristic curve [AUC] and z scores for sensitivity and specificity). Categories with fewer than 50 instances were excluded. csPCa = clinically significant prostate cancer, FOV = field of view, PCa = prostate cancer.
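As an illustration of the z-score intervals mentioned in the caption, the sketch below computes a normal-approximation (Wald) 95% CI for a proportion such as sensitivity or specificity, and skips subgroups with fewer than 50 instances. The caption does not say which z-based interval the authors used, so the Wald form is an assumption; the function name and the counts are hypothetical.

```python
import math


def wald_ci(successes, total, z=1.96, min_n=50):
    """Wald (normal-approximation) CI for a proportion.

    Returns None for groups smaller than min_n, mirroring the caption's
    exclusion of categories with fewer than 50 instances.
    """
    if total < min_n:
        return None
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clip to [0, 1] because the Wald interval can overshoot the bounds.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical subgroup: 80 of 100 csPCa cases detected (sensitivity 0.80).
ci = wald_ci(80, 100)
print(ci)

# A subgroup with only 30 cases is excluded rather than estimated.
small = wald_ci(25, 30)
print(small)
```

The Wald interval is the simplest z-based choice but behaves poorly near 0 or 1 and for small groups, which is one reason to exclude sparse categories as the caption describes.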
