Comparative Study
BMC Urol. 2021 May 16;21(1):80.
doi: 10.1186/s12894-021-00849-w.

Development and head-to-head comparison of machine-learning models to identify patients requiring prostate biopsy

Shuanbao Yu et al. BMC Urol.

Abstract

Background: Machine learning has many attractive theoretical properties, in particular the ability to handle relations that are not predefined. Additionally, studies have validated the clinical utility of multiparametric MRI (mpMRI) for the detection and localization of clinically significant prostate cancer (CSPCa; Gleason score ≥ 3 + 4). In this study, we sought to develop machine-learning models incorporating mpMRI parameters and to compare them with traditional logistic regression analysis for prediction of prostate cancer (PCa; Gleason score ≥ 3 + 3) and CSPCa on initial biopsy.

Methods: A total of 688 patients with no prior prostate cancer diagnosis and tPSA ≤ 50 ng/ml who underwent mpMRI and prostate biopsy between 2016 and 2020 were included. We used four supervised machine-learning algorithms in a hypothesis-free manner to build models to predict PCa and CSPCa. The machine-learning models were compared with logistic regression using the area under the ROC curve (AUC), calibration plots, and decision curve analysis.
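Decision curve analysis compares models by their net benefit at a chosen threshold probability. The following is a minimal sketch of the standard net-benefit formula, not the authors' code; the function names and the toy data are hypothetical and serve only as illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of biopsying every patient whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_biopsy_all(y_true, threshold):
    """Reference strategy: biopsy every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = cancer on biopsy, 0 = no cancer.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.35, 0.4, 0.1, 0.05, 0.2, 0.6, 0.7, 0.15]

nb_model = net_benefit(y_true, y_prob, threshold=0.3)
nb_all = net_benefit_biopsy_all(y_true, threshold=0.3)
print(f"model: {nb_model:.3f}, biopsy all: {nb_all:.3f}")
```

A decision curve is obtained by evaluating these functions over a range of thresholds; a model is useful at a given threshold only if its net benefit exceeds both the biopsy-all strategy and zero (biopsy none).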

Results: The artificial neural network (ANN), support vector machine (SVM), and random forest (RF) yielded diagnostic accuracy similar to that of logistic regression (LR), while classification and regression tree (CART, AUC = 0.834 and 0.867) had significantly lower diagnostic accuracy than logistic regression (AUC = 0.894 and 0.917) in prediction of PCa and CSPCa (all P < 0.05). However, CART showed the best calibration for PCa (SSR = 0.027) and CSPCa (SSR = 0.033). The ANN, SVM, RF, and LR models for PCa had higher net benefit than CART across threshold probabilities above 5%, and the five models for CSPCa displayed similar net benefit across threshold probabilities below 40%. The RF (53% and 57%, respectively) and SVM (52% and 55%, respectively) for PCa and CSPCa spared more unnecessary biopsies than logistic regression (35% and 47%, respectively) at 95% sensitivity for detection of CSPCa.
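The "biopsies spared at 95% sensitivity" figures can be read as follows: pick the highest risk cutoff that still catches at least 95% of cancers, then count the cancer-free patients who fall below it. The sketch below illustrates that reading under stated assumptions; the abstract does not give the exact definition used, and the function names and data are hypothetical.

```python
import math


def cutoff_at_sensitivity(y_true, y_prob, target=0.95):
    """Highest cutoff such that biopsying risk >= cutoff reaches the target sensitivity."""
    pos = sorted((p for y, p in zip(y_true, y_prob) if y == 1), reverse=True)
    if not pos:
        raise ValueError("no positive cases")
    k = math.ceil(target * len(pos))  # number of cancers that must be caught
    return pos[k - 1]


def unnecessary_biopsies_spared(y_true, y_prob, cutoff):
    """Fraction of cancer-free patients whose risk is below the cutoff (assumed definition)."""
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    return sum(1 for p in neg if p < cutoff) / len(neg)


# Hypothetical toy cohort: 20 patients with cancer, 10 without.
pos_probs = [0.05] + [0.30 + 0.03 * i for i in range(19)]
neg_probs = [0.02, 0.10, 0.15, 0.20, 0.25, 0.40, 0.50, 0.60, 0.28, 0.33]
y_true = [1] * len(pos_probs) + [0] * len(neg_probs)
y_prob = pos_probs + neg_probs

cutoff = cutoff_at_sensitivity(y_true, y_prob, target=0.95)
spared = unnecessary_biopsies_spared(y_true, y_prob, cutoff)
print(f"cutoff: {cutoff:.2f}, unnecessary biopsies spared: {spared:.0%}")
```

Fixing sensitivity first makes the models comparable on a clinically meaningful footing: each misses the same share of cancers, so the one that spares more biopsies is preferable at that operating point.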

Conclusion: Compared with logistic regression, machine-learning models (SVM and RF) yielded similar diagnostic accuracy and net benefit while sparing more biopsies at 95% sensitivity for detection of CSPCa. However, no method achieved the desired performance. All methods should continue to be explored and used in complementary ways.

Keywords: Machine learning; Predictive model; Prostate biopsy; Prostate cancer.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Receiver operating characteristic (ROC) curves of machine-learning and logistic regression models for predicting prostate cancer (PCa) and clinically significant prostate cancer (CSPCa) in the validation cohort. a PCa: Gleason score ≥ 3 + 3; b CSPCa: Gleason score ≥ 3 + 4. Abbreviations ANN artificial neural network, SVM support vector machine, CART classification and regression tree, RF random forest, LR logistic regression
Fig. 2
Calibration plot of observed vs predicted risk of prostate cancer (PCa) and clinically significant prostate cancer (CSPCa) using machine-learning and logistic regression models in the validation cohort. a PCa: Gleason score ≥ 3 + 3; b CSPCa: Gleason score ≥ 3 + 4. Abbreviations ANN artificial neural network, SVM support vector machine, CART classification and regression tree, RF random forest, LR logistic regression
Fig. 3
Decision curve analysis (DCA) of machine-learning and logistic regression models for predicting prostate cancer (PCa) and clinically significant prostate cancer (CSPCa) in the validation cohort. a PCa: Gleason score ≥ 3 + 3; b CSPCa: Gleason score ≥ 3 + 4. Abbreviations ANN artificial neural network, SVM support vector machine, CART classification and regression tree, RF random forest, LR logistic regression
