Sci Rep. 2025 May 19;15(1):17386.
doi: 10.1038/s41598-025-00179-z.

Feasibility of machine learning-based modeling and prediction to assess osteosarcoma outcomes

Qinfei Zhao et al. Sci Rep.

Abstract

Osteosarcoma, an aggressive bone malignancy predominantly affecting children and adolescents, is characterized by a poor prognosis and high mortality rates. The development of reliable prognostic tools is critical for advancing personalized treatment strategies. However, identifying robust gene signatures to predict osteosarcoma outcomes remains a significant challenge. In this study, we analyzed gene expression data from 138 osteosarcoma samples across two multicenter cohorts and identified 14 consensus prognosis-associated genes via univariate Cox regression analysis. Using 66 combinations of 10 machine learning (ML) algorithms, we developed a machine learning-derived prognostic signature (MLDPS) optimized by the average C-index across the TARGET, GSE21257, and merged cohorts. The MLDPS effectively stratified osteosarcoma patients into high- and low-risk score groups, achieving strong predictive performance for 1-, 3-, and 5-year overall survival (AUC range: 0.852–0.963). The MLDPS, comprising seven genes (CTNNBIP1, CORT, DLX2, TERT, BBS4, SLC7A1, NKX2-3), exhibited superior predictive accuracy compared to 10 established gene signatures. The MLDPS carries significant clinical implications for osteosarcoma treatment. Patients with a high risk score had a worse prognosis, increased metastasis risk, reduced immune infiltration, and greater sensitivity to immunotherapy. Conversely, low-risk patients exhibited prolonged survival and distinct drug sensitivities. These findings underscore the potential of the MLDPS to guide risk stratification, inform personalized therapeutic strategies, and improve clinical management in osteosarcoma.

Keywords: Machine learning; Osteosarcoma; Prognosis; Risk score; Tumor immunotherapy.
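The abstract relies on two ideas that are easy to illustrate: a gene-expression risk score evaluated by the concordance index (C-index), and a split of patients into high- and low-risk groups. The following is a minimal sketch in Python, not the authors' pipeline. The coefficients, expression values, and survival times are made-up toy numbers, the two-gene linear score stands in for the seven-gene MLDPS, and the median cutoff is an assumption, since the abstract does not state how the groups were separated.

```python
import numpy as np


def harrell_c_index(time, event, score):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has the shorter follow-up
    time and an observed event. The pair is concordant when patient i
    also has the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5
    return concordant / comparable


def stratify_by_median(score):
    """Label each patient 'high' or 'low' (median cutoff is an assumption)."""
    return np.where(score > np.median(score), "high", "low")


# Toy cohort (hypothetical): 6 patients, 2 genes.
expr = np.array([
    [3.0, 1.0],
    [2.5, 0.5],
    [1.0, 2.0],
    [0.5, 1.0],
    [0.5, 0.5],
    [1.5, 2.5],
])
coefs = np.array([0.8, -0.3])             # hypothetical model coefficients
time = np.array([2, 4, 6, 8, 10, 12])     # follow-up time, arbitrary units
event = np.array([1, 1, 0, 1, 0, 1])      # 1 = death observed, 0 = censored

score = expr @ coefs                      # linear risk score per patient
c_index = harrell_c_index(time, event, score)
groups = stratify_by_median(score)

print(f"C-index: {c_index:.3f}")
print("Risk groups:", groups.tolist())
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of survival times by risk score. Averaging this value over several cohorts, as the abstract describes, is one way to favor models that generalize rather than fit a single dataset.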

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Removal of batch effects and univariate regression analysis in osteosarcoma. (A) Principal component analysis (PCA) plot before batch effect removal. (B) PCA plot after batch effect removal. (C) Univariate Cox regression analysis results for 14 CPSGs in the GSE21257 and TARGET cohorts.
Fig. 2
GO and KEGG pathway analyses of CPSGs. The GO enrichment analysis of 14 CPSGs. The enriched BPs (A), CCs (B), and MFs (C) are noted. (D) The KEGG pathway enrichment analysis of 14 CPSGs.
Fig. 3
Construction and evaluation of the MLDPS effectiveness. (A) The C-index values of combinations of 10 well-established ML algorithms across three cohorts. (B) Seven CPSGs selected for the final model using the LASSO algorithm. (C) Time-dependent ROC analysis for predicting OS in the TARGET cohort. (D) AUC values of the time-dependent ROC curves in the TARGET cohort. (E) KM curves for OS in the TARGET cohort.
Fig. 4
Prediction performance of MLDPS in the GSE21257 and merged cohorts. Time-dependent ROC analysis for predicting OS in the (A) GSE21257 and (D) merged cohorts. AUC values of the time-dependent ROC curves in the (B) GSE21257 and (E) merged cohorts. KM curves for OS in the (C) GSE21257 and (F) merged cohorts.
Fig. 5
Comparison of C-index values between MLDPS and 10 expression-based signatures. C-index values of MLDPS and 10 published signatures in the TARGET, GSE21257, and merged cohorts. *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.
Fig. 6
The clinical signature of MLDPS and other signatures in the merged cohort. (A) Univariate Cox regression analysis of MLDPS and other signatures in the merged cohort. (B) Comparison of osteosarcoma between the high- and low-risk score groups. Left: Differences in the MLDPS risk scores between the OS groups. Right: Composition percentage of the high- and low-risk score groups in osteosarcoma. (C) Comparison of metastasis between the high- and low-risk score groups. Left: Differences in the MLDPS risk scores between the metastasis groups. Right: Composition percentage of the high- and low-risk score groups in metastasis.
Fig. 7
Function and pathway analysis of MLDPS. (A) Heatmap of the top 50 genes positively correlated with the MLDPS risk score. (B) Heatmap of the top 50 genes negatively correlated with the MLDPS risk score. The GO enrichment analysis of the top 500 positively correlated genes, including BP (C), CC (D), and MF (E). (F) The KEGG pathway enrichment analysis of the top 500 positively correlated genes.
Fig. 8
Landscapes of the TME between the high- and low-risk score groups. (A) Heatmap indicating the difference in the TME between the high- and low-risk score groups. (B) Differences in the ESTIMATEscore, ImmuneScore, MicroenvironmentScore, and StromalScore between the high- and low-risk score groups. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
Fig. 9
The correlations between MLDPS and gene expression of chemokines and their receptors. (A) Correlation heatmap between MLDPS and expression of chemokines and their receptors. (B) Correlation analysis of MLDPS and CCL8, CSF1, CCR1, CCR2, CCR4 and CCR7 gene expression. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
Fig. 10
Immunotherapeutic response prediction by TIDE. (A) Patients were classified as potential responders or non-responders to immunotherapy based on their TIDE values in the merged cohort. (B) Differences in Dysfunction scores between high- and low-risk score groups (P < 0.001). (C) Percent weight of clinical response to immunotherapy in high- and low-risk score groups. (D) Differences in TIDE scores between high- and low-risk score groups (P = 0.02).
Fig. 11
Association between the risk model and drug sensitivity in osteosarcoma. Drug sensitivity of osteosarcoma patients with high- and low-risk scores. *P < 0.05, **P < 0.01, ***P < 0.001.
