Development and validation of hybrid machine learning approach for predicting survival in patients with cervical cancer: a SEER-based population study

Front Oncol. 2025 Jun 18:15:1605378. doi: 10.3389/fonc.2025.1605378. eCollection 2025.

Anjana Eledath Kolasseri¹, Venkataramana B¹

PMID: 40606977
PMCID: PMC12213391
DOI: 10.3389/fonc.2025.1605378

Anjana Eledath Kolasseri et al. Front Oncol. 2025.

Affiliation

¹ School of Advanced Sciences, Vellore Institute of Technology, Vellore Tamil Nadu, India.

Abstract

Background: Accurate survival prediction in cervical cancer is crucial for personalized therapy, particularly in high-risk groups where early intervention may improve outcomes. This study aims to build a hybrid survival model that combines a Cox Proportional Hazards (CoxPH) model with Elastic Net regularization and a Random Survival Forest (RSF) to improve prediction accuracy and interpretability.

Methods: Data from the SEER database (2013-2015) were pre-processed through normalization and encoding. The RSF captured non-linear interactions between covariates, while the CoxPH model with Elastic Net regularization provided linear interpretability and identified key variables. Model parameters were optimized using cross-validation, and final performance was assessed on an independent test set using the C-index, Integrated Brier Score (IBS), AUC-ROC, and calibration plots.
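The abstract does not state how the CoxPH Elastic Net and RSF outputs are combined. As a purely illustrative sketch, one plausible scheme is a weighted average of rank-normalized risk scores, with the weight chosen by cross-validation. The function names `rank_normalize` and `blend_risk` and the `weight` parameter are hypothetical and are not taken from the study.

```python
def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank so that models with different
    output scales become comparable. Tied scores share the average rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def blend_risk(cox_risk, rsf_risk, weight=0.5):
    """Hypothetical hybrid score: weight * Cox rank + (1 - weight) * RSF rank.
    ASSUMPTION: the blending rule is illustrative, not the authors' method."""
    if len(cox_risk) != len(rsf_risk):
        raise ValueError("both models must score the same patients")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    cox_rank = rank_normalize(cox_risk)
    rsf_rank = rank_normalize(rsf_risk)
    return [weight * c + (1.0 - weight) * r
            for c, r in zip(cox_rank, rsf_rank)]


# Toy scores for three patients (made-up numbers, for illustration only).
hybrid = blend_risk([0.2, 1.5, -0.3], [10.0, 30.0, 20.0], weight=0.5)
print(hybrid)
```

Rank normalization is used here because a Cox linear predictor and an RSF ensemble mortality score are on different scales; averaging them directly would let one model dominate.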

Results: The hybrid model outperformed the individual models, with an Integrated Brier Score (IBS) of 0.13 and a concordance index (C-index) of 0.82. With an AUC-ROC of 0.84, it showed robust calibration and classification performance on the independent test set and effectively separated high-risk from low-risk individuals.
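For readers unfamiliar with the C-index reported above, the following minimal sketch computes Harrell's concordance index on right-censored data in plain Python. It assumes higher risk scores mean shorter expected survival, skips pairs with tied times, and uses made-up toy data; it is not the evaluation code used in the study.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored data.

    times  : observed follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risk   : predicted risk score; higher means worse prognosis

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the earlier-failing patient also has the higher risk score.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy example (invented values): four patients, one censored at 15 months.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.82 should be read.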

Conclusion: The hybrid model is a promising tool for personalized risk stratification in cervical cancer based on survival probability. Further testing across varied clinical categories is recommended to confirm its effectiveness in precision oncology.

Keywords: SEER database; cervical cancer; hybrid models; machine learning; survival models.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Comparison of predicted survival curves from Cox with Elastic Net Regularization, RSF, and Hybrid models. This comparison demonstrates the models’ ability to identify survival patterns while distinguishing long-term risk.

**Figure 2**
Calibration plot comparing predicted and observed survival probabilities for Cox with Elastic Net Regularization, RSF, and Hybrid models. The plot represents the association between predicted survival probability (x-axis) and actual survival outcomes (y-axis) across risk categories. The diagonal line indicates perfect calibration. The Hybrid model (green) closely follows the diagonal, showing better calibration throughout the probability range than the Cox Elastic Net Regularization (red) and RSF (blue), which show overestimation and underestimation in certain probability bins. This highlights the hybrid model’s ability to produce well-calibrated survival predictions.

**Figure 3**
Calibration plot for the Hybrid model on the independent test set, showing the agreement between predicted survival probabilities (x-axis) and observed survival outcomes (y-axis). The green dashed line is the Hybrid model’s calibration curve, and the black diagonal line denotes perfect calibration. The Hybrid model closely follows the ideal line, especially in the higher probability range (≥0.7), indicating reliable, well-calibrated predictions on unseen data.

**Figure 4**
Survival curve generated by the Hybrid model on the independent test set. The curve shows predicted survival probability over time (in months), highlighting the model’s capacity to provide clinically relevant survival outcomes for patient risk stratification.

**Figure 5**
Bar plot of the top five features in the Hybrid model, ranked by weighted importance score. These features contributed most to the model’s survival risk estimation.

**Figure 6**
Time-dependent AUC-ROC curves at 60 and 120 months for the hybrid model. AUC at 60 months: 0.84 (95% CI: 0.81–0.87); AUC at 120 months: 0.82 (95% CI: 0.78–0.85). Confidence intervals computed using 1,000 bootstrap resamples.
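The Figure 6 caption reports confidence intervals from 1,000 bootstrap resamples but does not say which bootstrap variant was used. The sketch below assumes the percentile method and, for simplicity, a plain binary AUC in which each patient's event status at the chosen horizon is fully known (no censoring adjustment). The names `auc` and `bootstrap_ci` and the toy data are hypothetical.

```python
import random


def auc(pairs):
    """Binary AUC via pairwise comparison (Mann-Whitney form).

    pairs: list of (label, score), label 1 = event by the horizon.
    Returns None when a sample contains only one class.
    """
    pos = [s for y, s in pairs if y == 1]
    neg = [s for y, s in pairs if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(data, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (ASSUMPTION: percentile method).

    Resamples patients with replacement, recomputes the statistic each
    time, and returns the alpha/2 and 1 - alpha/2 empirical quantiles.
    Resamples for which the statistic is undefined (None) are skipped.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        value = statistic(sample)
        if value is not None:
            estimates.append(value)
    if not estimates:
        raise ValueError("statistic undefined on every resample")
    estimates.sort()
    m = len(estimates)
    lower = estimates[int((alpha / 2) * m)]
    upper = estimates[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper


# Toy (label, risk score) pairs; invented values for illustration only.
toy = [(1, 0.9), (1, 0.8), (1, 0.4), (1, 0.6),
       (0, 0.7), (0, 0.3), (0, 0.2), (0, 0.1)]
print(auc(toy), bootstrap_ci(toy, auc, n_resamples=1000))
```

A time-dependent AUC on censored data would additionally need inverse-probability-of-censoring weights or a similar correction, which this sketch deliberately leaves out.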
