Comparative Study
Sci Rep. 2024 Sep 27;14(1):22203.
doi: 10.1038/s41598-024-72790-5.

Comparative study of machine learning and statistical survival models for enhancing cervical cancer prognosis and risk factor assessment using SEER data

Anjana Eledath Kolasseri et al. Sci Rep. 2024.

Abstract

Cervical cancer is a common malignant tumor of the female reproductive system and a leading cause of cancer death among women worldwide. Survival prediction methods analyze time-to-event outcomes, which are central to clinical studies. This study aims to bridge the gap between traditional statistical methods and machine learning in survival analysis by identifying which techniques predict survival most effectively, with particular emphasis on improving prediction accuracy and identifying key risk factors for cervical cancer. Women diagnosed with cervical cancer between 2013 and 2015 were included, using data from the Surveillance, Epidemiology, and End Results (SEER) database. Using this dataset, the study assesses the performance of Weibull models, Cox proportional hazards models, and Random Survival Forests (RSF) in terms of predictive accuracy and risk factor identification. The findings indicate that machine learning models, particularly RSF, outperform traditional statistical methods in both predictive accuracy and the identification of crucial prognostic factors, underscoring the advantages of machine learning in handling complex survival data. However, for a survival dataset with a small number of predictors, statistical models should be tried first. The study finds that RSF models enhance survival analysis with more accurate predictions and insights into survival risk factors, but it highlights the need for larger datasets and further research on model interpretability and clinical applicability.

Keywords: Cervical cancer; Machine learning; Prognostic factors; Statistical methods; Survival analysis.
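
The abstract compares models by "predictive accuracy" without naming the metric. A common choice for survival models is Harrell's concordance index (C-index); whether the authors used it is an assumption here, not something the abstract states. The sketch below is a minimal, dependency-free illustration of that metric. The function name `concordance_index` and the toy data are hypothetical and are not taken from the paper or from SEER.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (assumed metric, not confirmed by the abstract).

    times       -- follow-up time for each patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event, higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Toy data (hypothetical): four patients, one censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Pairs with tied event times are skipped in this simplified version; library implementations handle them more carefully.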


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Survival time (months) for cervical cancer patients in the SEER database (2013–2015).
Fig. 2. Survival curves of the Cox proportional hazards (Cox PH), Weibull, and Random Survival Forest (RSF) models compared against the observed Kaplan-Meier survival curve.
Fig. 3. Variable importance plots for (a) the Weibull model, (b) the Cox proportional hazards model, and (c) the Random Survival Forest.
Fig. 4. Residual plot for the Weibull model.
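
The observed Kaplan-Meier curve named in the Fig. 2 caption is the nonparametric baseline against which the fitted models are compared. As a rough illustration of how such a curve is obtained, here is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy data are hypothetical; this is not the authors' code.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct observed event time.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed, 0 if censored
    Returns a list of (time, survival_probability) tuples.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): five patients, two censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the estimate only changes at observed event times.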


