Prediction and the influencing factor study of colorectal cancer hospitalization costs in China based on machine learning-random forest and support vector regression: a retrospective study
- PMID: 38389946
- PMCID: PMC10881792
- DOI: 10.3389/fpubh.2024.1211220
Abstract
Aims: As living standards rise, the incidence of colorectal cancer is increasing, and hospitalization costs for the disease are relatively high. Predicting the hospitalization costs of colorectal cancer patients can therefore guide healthcare cost control and the development of related policies.
Methods: This study used medical record front-page data for colorectal cancer inpatients at a tertiary first-class hospital in Shenzhen from 2018 to 2022. The factors influencing hospitalization costs for colorectal cancer were analyzed. Random forest and support vector regression were used to build predictive models of hospitalization costs for colorectal cancer patients, and the two models were compared and evaluated.
Results: Among colorectal cancer inpatients, the important influencing factors were major procedures, length of stay, level of procedure, Charlson comorbidity index, age, and medical payment method. On the test set, the R² was 0.833 for the random forest model and 0.824 for the support vector regression model; the root mean square error (RMSE) was 0.029 for the random forest model and 0.032 for the support vector regression model. In the random forest model, the major procedure had the highest weight (0.286).
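The two test-set metrics reported above can be illustrated with a minimal sketch. It uses the standard definitions of R² and RMSE and is not the authors' code. The function names and all numbers are hypothetical, chosen only to show how two models would be compared on the same held-out observations. The abstract does not state the scale on which costs were measured, so the values are arbitrary.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error of the predictions."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical held-out costs and predictions from two hypothetical models.
# These numbers are illustrative assumptions, not data from the study.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred_a = [0.25, 0.35, 0.65, 0.75]
y_pred_b = [0.3, 0.3, 0.7, 0.7]

for name, y_pred in [("model A", y_pred_a), ("model B", y_pred_b)]:
    print(f"{name}: R2={r_squared(y_true, y_pred):.3f}, "
          f"RMSE={rmse(y_true, y_pred):.3f}")
```

A higher R² together with a lower RMSE on the same test set indicates the better fit, which is the comparison the abstract draws between the two models.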
Conclusion: Major procedures and length of stay have the greatest impact on hospitalization costs for colorectal cancer patients. The random forest model predicts these costs better than the support vector regression model.
Keywords: colorectal cancer; hospitalization costs; influencing factors; random forest; support vector regression.
Copyright © 2024 Gao and Liu.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Similar articles
- Retrospective Study on the Influencing Factors and Prediction of Hospitalization Expenses for Chronic Renal Failure in China Based on Random Forest and LASSO Regression. Front Public Health. 2021 Jun 15;9:678276. doi: 10.3389/fpubh.2021.678276. eCollection 2021. PMID: 34211956. Free PMC article.
- Predicting hospitalization costs for pulmonary tuberculosis patients based on machine learning. BMC Infect Dis. 2024 Aug 28;24(1):875. doi: 10.1186/s12879-024-09771-6. PMID: 39198742. Free PMC article.
- Development of a System for Predicting Hospitalization Time for Patients With Traumatic Brain Injury Based on Machine Learning Algorithms: User-Centered Design Case Study. JMIR Hum Factors. 2024 Aug 30;11:e62866. doi: 10.2196/62866. PMID: 39212592. Free PMC article.
- Machine-learning-based cost prediction models for inpatients with mental disorders in China. BMC Psychiatry. 2025 Jan 9;25(1):33. doi: 10.1186/s12888-024-06358-y. PMID: 39789477. Free PMC article.
- Predicting length of stay and mortality among hospitalized patients with type 2 diabetes mellitus and hypertension. Int J Med Inform. 2021 Oct;154:104569. doi: 10.1016/j.ijmedinf.2021.104569. Epub 2021 Sep 4. PMID: 34525441.
Cited by
- Cost-effectiveness of the 3E model in diabetes management: a machine learning approach to assess long-term economic impact. Front Public Health. 2025 May 23;13:1571546. doi: 10.3389/fpubh.2025.1571546. eCollection 2025. PMID: 40487535. Free PMC article.
- Design of upper limb muscle strength assessment system based on surface electromyography signals and joint motion. Front Neurol. 2024 Dec 13;15:1470759. doi: 10.3389/fneur.2024.1470759. eCollection 2024. PMID: 39734626. Free PMC article.
- Development of an upper limb muscle strength rehabilitation assessment system using particle swarm optimisation. Front Bioeng Biotechnol. 2025 Jul 9;13:1619411. doi: 10.3389/fbioe.2025.1619411. eCollection 2025. PMID: 40704098. Free PMC article.