NPJ Digit Med. 2025 Aug 16;8(1):524.
doi: 10.1038/s41746-025-01930-6.

Integrative machine learning models predict prostate cancer diagnosis and biochemical recurrence risk: Advancing precision oncology

Yaxuan Wang et al. NPJ Digit Med. 2025.

Abstract

Prostate cancer (PCa) ranks among the most prevalent cancers in men worldwide. Biochemical recurrence (BCR) presents a major clinical challenge in PCa management, with significant prognostic heterogeneity observed among patients post-recurrence. This study aimed to develop machine learning models for predicting both the diagnosis and prognosis of PCa patients. Using WGCNA, we initially identified 16 BCR-related target genes. Cluster analysis revealed these genes were significantly associated with PCa prognosis, drug sensitivity, and immune infiltration. We constructed a robust diagnostic model integrating multiple machine learning algorithms, demonstrating strong predictive capability for PCa. Furthermore, a BCR-related prognostic model built using the LASSO algorithm also yielded satisfactory performance. Among the differentially expressed BCR-associated prognostic genes, COMP emerged as a critical regulatory factor. Both in vitro and in vivo experiments confirmed COMP's role in influencing PCa progression. Additionally, COMP demonstrates significant potential as a dual biomarker for both the diagnosis and recurrence prediction of PCa.
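
A prognostic model of the kind described above is often summarized as a linear risk score over the selected genes, with patients split into high-risk and low-risk groups. The sketch below illustrates that idea only. The gene names, coefficients, expression values, and the median cutoff are hypothetical placeholders; the abstract does not state the fitted coefficients or the cutoff used.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Return a linear risk score per sample: sum of coefficient * expression."""
    return {
        sample: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for sample, values in expression.items()
    }


def split_by_median(scores):
    """Label samples 'high' if their score exceeds the median, else 'low'.

    The median cutoff is an assumption made for illustration.
    """
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical coefficients and expression values, not taken from the study.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {
    "s1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "s2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "s3": {"GENE_A": 1.0, "GENE_B": 2.0},
    "s4": {"GENE_A": 6.0, "GENE_B": 4.0},
}

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
```

In practice the coefficients would come from the fitted Cox model, and the resulting groups would be compared with Kaplan-Meier curves.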

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Identification of 16 BCR-associated genes based on WGCNA.
A Scale independence plot. B Mean connectivity plot. C Sample clustering dendrogram. D Partition of the GSE116918 dataset into 11 modules. E Module-phenotype correlation heat map. F Scatter plot of the correlation between gene significance and module membership. G Heatmap of BCR-related gene expression in TCGA-PRAD. H Prognostic forest plot of BCR-related genes in TCGA-PRAD. I Correlation heatmap of BCR-related genes in TCGA-PRAD. J Differential expression of BCR-related genes in BCR and non-BCR samples in GSE116918.
Fig. 2. BCR-related genes play an important role in PRAD.
A KEGG and GO analysis of BCR-related genes in PRAD. B GSCA database-based analysis of BCR-related gene function in PRAD. C Analysis of the correlation between BCR-related genes and the frequency of copy number variants. D–F Differential expression of BCR-related genes in different pathological stages of TCGA-PRAD. ***p < 0.001.
Fig. 3. BCR-related genes are significantly associated with prognostic and pathologic parameters in PRAD patients.
A, B Consensus map of NMF clustering. C Survival differences between clusters. D Differences in the expression of BCR-related genes between clusters. E–H Differences in the distribution of subgroups across pathological stages of PRAD. ****p < 0.0001.
Fig. 4. BCR-related genes are strongly associated with immune infiltration in PRAD.
A, B Analysis of the correlation between BCR-related genes and PRAD immune infiltration. C Heat map of different immune cell infiltration levels. D Analysis of BCR-related genes and chemotherapy sensitivity in PRAD patients. E Gene enrichment analysis of two clusters. ns p ≥ 0.05; *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
Fig. 5. The LASSO + LDA algorithm combination is considered to be the best combination for constructing diagnostic models.
A Predictive value of biochemical recurrence-related genes for the diagnosis of PCa patients. B AUC values for diagnostic models created using various algorithm combinations. C Number of genes included in the diagnostic models built with different algorithm combinations.
Fig. 6. Constructing BCR-related prognostic models.
A Screening of prognosis-related genes by univariate Cox analysis. B Genes were screened by LASSO/stepwise regression, and a model was constructed using multivariate Cox analysis. C ROC curve analysis of the model’s predictive ability for BCR-related prognosis in the entire dataset. D Kaplan-Meier survival curves for the high-risk and low-risk groups in the entire dataset. E Risk factor plot of the test dataset. F Risk factor plot of the training dataset. G ROC curve analysis of the model’s predictive ability for BCR-related prognosis in the test dataset. H Kaplan-Meier survival curves for the high-risk and low-risk groups in the test dataset. I ROC curve analysis of the model’s predictive ability for BCR-related prognosis in the training dataset. J Kaplan-Meier survival curves for the high-risk and low-risk groups in the training dataset.
Fig. 7. COMP is identified as a key regulatory gene for BCR.
A, B The XGBoost algorithm identifies key regulatory genes for BCR. C Friend analysis of dominant genes among BCR-regulated genes. D, E Correlation analysis of COMP with the level of immune cell infiltration. F Molecular docking of COMP with drugs. G, H Functional analysis of COMP in PRAD. I Correlation analysis between COMP and immunotherapy. *p < 0.05; **p < 0.01; ***p < 0.001.
Fig. 8. COMP is highly expressed in PRAD.
A–C Expression of COMP in normal prostate tissue. D–F Expression of COMP in non-recurrent PRAD tissues. G–I Expression of COMP in recurrent PRAD tissues. J Differences in COMP expression between normal prostate tissue and PRAD. K Predictive value of COMP in the diagnosis of PRAD patients. L Difference in COMP expression between recurrent and non-recurrent PRAD samples. M Predictive value of COMP for recurrence in PRAD patients. *p < 0.05; ***p < 0.001.
Fig. 9. Knockdown of COMP can inhibit the proliferation and metastasis of PCa.
A Verification of COMP knockdown efficiency in PCa cells. B, C Colony formation assay used to evaluate the effect of COMP on PCa cell proliferation. D, E Transwell assays to assess the effect of COMP on the metastatic ability of PCa cells. F Tumor modeling in male BALB/c nude mice. G, H Subcutaneous tumor volume change curve. I Final subcutaneous tumor weight. J Final subcutaneous tumor size. K, L IHC staining images of COMP and Ki-67 in subcutaneous tumors. M, N Measurement of lung metastasis luciferase activity via in vivo imaging system. **p < 0.01; ***p < 0.001.
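
Several captions above (Fig. 5B, Fig. 8K, Fig. 8M) report predictive value as an area under the ROC curve (AUC). As a minimal sketch of what that number measures, the AUC for a binary outcome equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores below are made up for illustration and are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes (1 = positive, e.g. tumor or recurrence).
    scores: iterable of model scores, higher meaning more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example values, not taken from the study.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.65]
example_auc = auc(example_labels, example_scores)
```

The pairwise loop is quadratic in the number of samples; it is written this way for readability, and a rank-based implementation would be preferable for large cohorts.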
