Front Oncol. 2025 Jun 18;15:1605378.
doi: 10.3389/fonc.2025.1605378. eCollection 2025.

Development and validation of hybrid machine learning approach for predicting survival in patients with cervical cancer: a SEER-based population study

Anjana Eledath Kolasseri et al. Front Oncol. 2025.

Abstract

Background: Accurate survival prediction in cervical cancer is crucial for personalized therapy, particularly in high-risk groups where early intervention might improve outcomes. This study aims to develop a hybrid survival model that integrates Cox Proportional Hazards (CoxPH) with Elastic Net regularization and Random Survival Forest (RSF) to improve prediction accuracy and interpretability.

Methods: Data from the SEER database (2013-2015) were pre-processed through normalization and encoding. RSF captured non-linear interactions between covariates, while the CoxPH model with Elastic Net regularization provided linear interpretability and identified key variables. Model parameters were optimized using cross-validation, and final performance was assessed on an independent test set using the C-index, Integrated Brier Score (IBS), AUC-ROC, and calibration plots.
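The abstract does not state how the CoxPH and RSF outputs are combined, so the following is only a minimal sketch under an assumed scheme: each model's risk scores are rank-scaled to a common range, averaged with a weight `w`, and `w` is picked from a grid by the C-index. The functions `rank_scale`, `blend`, and `best_weight`, the toy data, and the weighting scheme itself are hypothetical illustrations, not the authors' method.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that are ranked correctly.

    A pair is comparable when the subject with the shorter time had an event.
    A higher risk score should go with the shorter time; risk ties count 0.5.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored: pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def rank_scale(scores):
    """Map scores to [0, 1] by rank so both models share one scale."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    scaled = [0.0] * len(scores)
    for rank, k in enumerate(order):
        scaled[k] = rank / (len(scores) - 1)
    return scaled


def blend(cox_risk, rsf_risk, w):
    """Hypothetical hybrid score: weighted mean of rank-scaled risks."""
    a, b = rank_scale(cox_risk), rank_scale(rsf_risk)
    return [w * x + (1 - w) * y for x, y in zip(a, b)]


def best_weight(times, events, cox_risk, rsf_risk, grid=None):
    """Pick the grid weight with the highest C-index (assumed selection rule)."""
    grid = grid or [i / 10 for i in range(11)]
    return max(
        grid,
        key=lambda w: concordance_index(times, events, blend(cox_risk, rsf_risk, w)),
    )


# Toy data (made up): follow-up in months, event flag, two models' risk scores.
times = [5, 8, 12, 20, 30, 42, 50, 61]
events = [1, 1, 0, 1, 0, 1, 1, 0]
cox_risk = [2.1, 0.8, 0.9, 1.2, 0.3, 0.4, 0.1, 0.2]
rsf_risk = [0.7, 0.9, 0.5, 0.3, 0.6, 0.35, 0.2, 0.1]

c_cox = concordance_index(times, events, cox_risk)
c_rsf = concordance_index(times, events, rsf_risk)
w_best = best_weight(times, events, cox_risk, rsf_risk)
c_hybrid = concordance_index(times, events, blend(cox_risk, rsf_risk, w_best))
print(c_cox, c_rsf, w_best, c_hybrid)
```

In a real workflow the weight would be chosen on validation folds rather than on the data used for the final score, in line with the cross-validation and independent test set described in the Methods.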

Results: The hybrid model outperformed the individual models, with an Integrated Brier Score (IBS) of 0.13 and a concordance index (C-index) of 0.82. With an AUC-ROC of 0.84, the model showed robust calibration and classification performance on the independent test set, effectively separating individuals at high and low risk.
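To give the IBS figure some context, here is a simplified sketch of how a Brier score can be averaged over time. It assumes no censoring (every subject's event time is observed), so it omits the inverse-probability-of-censoring weights that a full survival IBS would need; the exponential survival curves and all names are hypothetical. Under this simplification, a constant prediction of 0.5 scores 0.25 and a perfect prediction scores 0, so lower is better.

```python
import math


def brier_score(times, surv_probs, t):
    """Mean squared gap between predicted S(t) and observed status 1{T > t}.

    Simplification: assumes every event time is observed (no censoring).
    """
    total = 0.0
    for event_time, prob in zip(times, surv_probs):
        alive = 1.0 if event_time > t else 0.0
        total += (alive - prob) ** 2
    return total / len(times)


def integrated_brier_score(times, surv_fns, grid):
    """Trapezoid-rule average of the Brier score over an increasing time grid."""
    scores = [brier_score(times, [s(t) for s in surv_fns], t) for t in grid]
    area = 0.0
    for k in range(len(grid) - 1):
        area += (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2
    return area / (grid[-1] - grid[0])


# Toy data (made up): observed event times in months and one hazard rate each.
times = [4, 9, 15, 22, 31, 40]
rates = [0.15, 0.09, 0.05, 0.04, 0.02, 0.02]
grid = [0, 12, 24, 36]

# Hypothetical predicted survival curves S(t) = exp(-rate * t), one per subject.
model_fns = [lambda t, r=r: math.exp(-r * t) for r in rates]
coin_fns = [lambda t: 0.5 for _ in times]
perfect_fns = [lambda t, T=T: 1.0 if T > t else 0.0 for T in times]

ibs_model = integrated_brier_score(times, model_fns, grid)
ibs_coin = integrated_brier_score(times, coin_fns, grid)
ibs_perfect = integrated_brier_score(times, perfect_fns, grid)
print(ibs_model, ibs_coin, ibs_perfect)
```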

Conclusion: The hybrid model provides a promising tool for personalized risk stratification in cervical cancer based on survival probability. Further validation across varied clinical categories is recommended to confirm its effectiveness in precision oncology.

Keywords: SEER database; cervical cancer; hybrid models; machine learning; survival models.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1: Comparison of predicted survival curves from Cox with Elastic Net Regularization, RSF, and Hybrid models. This comparison demonstrates the models’ ability to identify survival patterns while distinguishing long-term risk.

Figure 2: Calibration plot comparing predicted and observed survival probabilities for Cox with Elastic Net Regularization, RSF, and Hybrid models. The plot represents the association between predicted survival probability (x-axis) and actual survival outcomes (y-axis) across risk categories. The diagonal line indicates perfect calibration. The Hybrid model (green) closely follows the diagonal, showing better calibration throughout the probability range than the Cox Elastic Net Regularization (red) and RSF (blue), which show overestimation and underestimation in certain probability bins. This highlights the hybrid model’s ability to produce well-calibrated survival predictions.

Figure 3: Calibration plot for the Hybrid model on the independent test set, which indicates the agreement between predicted survival probabilities (x-axis) and observed survival outcomes (y-axis) using test data. The green dashed line shows the Hybrid model’s calibration curve, while the black diagonal line denotes perfect calibration. The Hybrid model closely aligns with the ideal line, especially in the higher probability range (≥0.7), showing strong reliability and calibration of the model’s predictions in unseen data.

Figure 4: Survival curve generated by the Hybrid model on the independent test set. The curve illustrates the predicted survival probability over time (in months) for the independent test set using the Hybrid model, which highlights the model’s capacity to provide clinically relevant survival outcomes for patient risk stratification.

Figure 5: The bar plot shows the top five features contributing to the Hybrid model based on weighted importance scores, which collectively contributed most to the model’s survival risk estimation.

Figure 6: Time-dependent AUC-ROC curves at 60 and 120 months for the hybrid model. AUC at 60 months: 0.84 (95% CI: 0.81–0.87); AUC at 120 months: 0.82 (95% CI: 0.78–0.85). Confidence intervals computed using 1,000 bootstrap resamples.
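The Figure 6 caption reports confidence intervals from 1,000 bootstrap resamples. A percentile bootstrap of that kind might look like the sketch below. It treats the outcome as a plain binary label (event by a fixed horizon, assuming status is known for everyone at that horizon), which is a simplification of a time-dependent AUC; the simulated data, the seed, and the function names are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties count 0.5. Returns None when one class is missing.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample subjects with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        value = auc([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:  # skip resamples that contain a single class
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data (made up): 1 = event by the horizon, score = predicted risk.
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.4 else 0 for _ in range(200)]
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(point, lower, upper)
```

Fixing the seed makes the interval reproducible; a different seed gives slightly different bounds.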
