Front Public Health. 2024 Feb 8:12:1211220.
doi: 10.3389/fpubh.2024.1211220. eCollection 2024.

Prediction and the influencing factor study of colorectal cancer hospitalization costs in China based on machine learning-random forest and support vector regression: a retrospective study

Jun Gao et al. Front Public Health. 2024.

Abstract

Aims: As living standards improve, the incidence of colorectal cancer is increasing, and hospitalization costs for colorectal cancer are relatively high. Predicting the cost of hospitalization for colorectal cancer patients can therefore provide guidance for controlling healthcare costs and for developing related policies.

Methods: This study used front-page medical record data on colorectal cancer inpatient cases from a tertiary first-class hospital in Shenzhen from 2018 to 2022. The factors influencing hospitalization costs for colorectal cancer were analyzed. Random forest and support vector regression models were built to predict the cost of hospitalization for colorectal cancer patients, and the two models were compared and evaluated.
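A minimal sketch of this kind of model comparison is given here for illustration. It assumes scikit-learn and NumPy, uses synthetic data in place of the hospital records, and picks an arbitrary 70/30 split and arbitrary hyperparameters; the feature names, the cost relation, and the function names `make_synthetic_costs` and `compare_models` are hypothetical, and none of these choices are taken from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_costs(n=400, seed=0):
    """Generate made-up inpatient records (NOT the study data)."""
    rng = np.random.default_rng(seed)
    length_of_stay = rng.integers(1, 40, size=n)   # days
    major_procedure = rng.integers(0, 2, size=n)   # 0 = no, 1 = yes
    age = rng.integers(30, 90, size=n)             # years
    X = np.column_stack([length_of_stay, major_procedure, age])
    # Hypothetical cost relation in arbitrary units, plus noise.
    y = (0.02 * length_of_stay + 0.5 * major_procedure
         + 0.002 * age + rng.normal(0.0, 0.05, size=n))
    return X, y


def compare_models(X, y, seed=0):
    """Fit both regressors on a training split and score them on a test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    models = {
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=seed),
        # SVR is sensitive to feature scale, so inputs are standardized first.
        "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.05)),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "r2": float(r2_score(y_test, pred)),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        }
    # Impurity-based importances from the fitted forest; they sum to 1.
    importances = models["random_forest"].feature_importances_
    return results, importances


X, y = make_synthetic_costs()
results, importances = compare_models(X, y)
for name, scores in results.items():
    print(name, scores)
print("random forest importances:", importances)
```

The random forest's `feature_importances_` attribute gives one normalized weight per input column, which is one common way to rank influencing factors; whether the study derived its weights this way is not stated in the abstract.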

Results: Among colorectal cancer inpatients, major procedures, length of stay, level of procedure, Charlson comorbidity index, age, and medical payment method were the important influencing factors. On the test set, the R² was 0.833 for the random forest model and 0.824 for the support vector regression model; the root mean square error (RMSE) was 0.029 for the random forest model and 0.032 for the support vector regression model. In the random forest model, the weight of the major procedure was the highest (0.286).
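For readers unfamiliar with the two evaluation metrics, the following sketch shows how R² and RMSE are conventionally computed from observed and predicted values. The function names and the four-point example are illustrative assumptions, not data or code from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(round(r_squared(observed, predicted), 3))  # 0.9
print(round(rmse(observed, predicted), 3))       # 0.354
```

A higher R² and a lower RMSE both indicate a closer fit, which is the sense in which the random forest model (0.833, 0.029) outperforms support vector regression (0.824, 0.032) on the reported test set.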

Conclusion: Major procedures and length of stay have the greatest impact on hospitalization costs for colorectal cancer patients. The random forest model predicts hospitalization costs for colorectal cancer patients better than the support vector regression model.

Keywords: colorectal cancer; hospitalization costs; influencing factors; random forest; support vector regression.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. SVR model tuning.
