Using machine learning in prediction of ICU admission, mortality, and length of stay in the early stage of admission of COVID-19 patients

Ann Oper Res. 2022 Sep 29:1-29. doi: 10.1007/s10479-022-04984-x. Online ahead of print.

Sara Saadatmand¹, Khodakaram Salimifard¹, Reza Mohammadi², Alex Kuiper², Maryam Marzban³, Akram Farhadi⁴

Affiliations

¹ Computational Intelligence and Intelligent Optimization Research Group, Persian Gulf University, Bushehr, 75169 Iran.
² Section Business Analytics, Amsterdam Business School, University of Amsterdam, Amsterdam, The Netherlands.
³ Department of Public Health, School of Public Health, Bushehr University of Medical Science, Bushehr, Iran.
⁴ The Persian Gulf Tropical Medicine Research Center, The Persian Gulf Biomedical Science Research Institute, Bushehr University of Medical Science, Bushehr, Iran.

PMID: 36196268
PMCID: PMC9521862
DOI: 10.1007/s10479-022-04984-x

Sara Saadatmand et al. Ann Oper Res. 2022.

Abstract

The recent COVID-19 pandemic has affected health systems across the world. Intensive Care Units (ICUs) in particular have played a pivotal role in the treatment of critically ill patients. At the same time, however, the increasing number of admissions due to the high prevalence of the virus has caused several problems for ICU wards, such as overburdening of staff and shortages of medical resources. These issues might have affected the quality of the healthcare services provided, directly impacting patients' survival. The objective of this research is to leverage Machine Learning (ML) on hospital data in order to support hospital managers and practitioners with the treatment of COVID-19 patients. This is accomplished by providing more detailed inference about a patient's likelihood of ICU admission, mortality, and, in case of hospitalization, the length of stay (LOS). To this end, each of the three outcome variables is predicted in a separate model by five different ML algorithms: eXtreme Gradient Boosting (XGB), K-Nearest Neighbor (KNN), Random Forest (RF), bagged-CART (b-CART), and LogitBoost (LB). With the exception of KNN, the studied models show good predictive capabilities when evaluated on relevant performance scores, such as the area under the curve. Implementing an ensemble stacking approach (with either a Neural Net or a General Linear Model as the meta-model) on top of the aforementioned ML algorithms further boosts performance. Ultimately, for the prediction of admission to the ICU, ensemble stacking via a Neural Net achieved the best result, with an accuracy of over 95%. For mortality at the ICU, the vanilla XGB performed slightly better (a 1% difference with the meta-model). For predicting long lengths of stay, both ensemble stacking approaches yield comparable results. Besides its direct implications for managing COVID-19 patients, the approach presented serves as an example of how data can be employed in future pandemics or crises.
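The stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes Python with scikit-learn, uses synthetic data from `make_classification` in place of the hospital records, substitutes `GradientBoostingClassifier` for XGB, omits LogitBoost (scikit-learn has no LogitBoost estimator), and uses logistic regression as a simple linear meta-model. All variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary outcome standing in for, e.g., ICU admission (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners loosely mirroring the abstract's list; see substitutions above.
base_learners = [
    ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGB
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("bcart", BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                                random_state=0)),  # bagged CART
]

# The meta-model is trained on out-of-fold predicted probabilities (cv=5)
# produced by the base learners, which is what makes this "stacking".
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

test_proba = stack.predict_proba(X_test)
test_auc = roc_auc_score(y_test, test_proba[:, 1])
print(f"Stacked ensemble test AUC: {test_auc:.3f}")
```

Swapping `final_estimator` for a small neural network (for example scikit-learn's `MLPClassifier`) would give the other meta-model variant mentioned in the abstract; the numbers produced on synthetic data say nothing about the paper's reported results.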

Keywords: COVID-19 pandemic; Ensemble modeling; ML in health systems; Supervised learning.

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Conflict of interest statement

Conflict of interests: The authors declare that they have no conflict of interest.

Figures

**Fig. 1**
Architecture of the framework of this study

**Fig. 2**
The selection of cases of ICU admitted COVID-19 patients

**Fig. 3**
Percentage of missing values of the dataset. Abbreviations: LDH (lactate dehydrogenase), T.B. (total bilirubin), ESR (erythrocyte sedimentation rate), AST (aspartate aminotransferase), ALT (alanine transaminase), L.disease (chronic lung disease), Nd.disease (chronic neurological disorder), K.disease (chronic kidney disease), S.cough (sputum cough), A.pain (abdominal pain), H.disease (heart disease), INR (international normalized ratio), High.bp (high blood pressure), PT (prothrombin time), O2.s (O2 saturation), WBC (white blood cell count), R.rate (respiratory rate), Diastolic (diastolic pressure), Temp (temperature), H.rate (heart rate), Systolic (systolic pressure), ARI (acute respiratory infection), NCD (non-communicable diseases)

**Fig. 4**
Kernel Density Estimation of initial and imputed data for some of the variables. The red curves denote the imputed data distribution and the blue curves demonstrate the distribution of initial data. Abbreviations: Temp (temperature), H.rate (heart rate), R.rate (respiratory rate), Systolic (systolic pressure), Diastolic (diastolic pressure), O2.s (O2 saturation), Fever.H (history of fever), PT (prothrombin time), INR (international normalized ratio), ALT (alanine transaminase), LDH (lactate dehydrogenase), ESR (erythrocyte sedimentation rate)

**Fig. 5**
The Boruta algorithm feature selection for ICU admission. Green boxes denote the confirmed features, yellow boxes represent the tentative variables, blue boxes illustrate the minimum, average, and maximum of shadow variables, and red boxes show the irrelevant features

**Fig. 6**
The Boruta algorithm feature selection for ICU mortality. Green boxes denote the confirmed features, yellow boxes represent the tentative variables, blue boxes illustrate the minimum, average, and maximum of shadow variables, and red boxes show the irrelevant features

**Fig. 7**
The Boruta algorithm feature selection for ICU LOS. Green boxes denote the confirmed features, yellow boxes represent the tentative variables, blue boxes illustrate the minimum, average, and maximum of shadow variables, and red boxes show the irrelevant features

**Fig. 8**
The ROC curves of five ML algorithms for ICU admission. Abbreviations: XGB (extreme gradient boosting), KNN (k-nearest neighbor), RF (random forest), b-CART (bagged CART), LB (LogitBoost)

**Fig. 9**
The ROC curves of five ML algorithms for ICU mortality. Abbreviations: XGB (extreme gradient boosting), KNN (k-nearest neighbor), RF (random forest), b-CART (bagged CART), LB (LogitBoost)

**Fig. 10**
The ROC curves of five ML algorithms for ICU LOS. Abbreviations: XGB (extreme gradient boosting), KNN (k-nearest neighbor), RF (random forest), b-CART (bagged CART), LB (LogitBoost)

**Fig. 11**
The overall concept of the ensemble method
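The captions of Figs. 5 to 7 refer to "shadow variables" in the Boruta feature selection. As a rough illustration of that idea only, the sketch below runs a single shadow-feature comparison round in Python with scikit-learn. It is an assumption-laden toy, not the Boruta algorithm itself or the authors' setup: the data are synthetic, the names are hypothetical, and full Boruta repeats such rounds many times and applies a statistical test before labelling features confirmed, tentative, or rejected.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical toy data: columns 0 and 1 determine the label, columns 2-5 are noise.
n_samples, n_features = 500, 6
X = rng.normal(size=(n_samples, n_features))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Shadow features: each column permuted independently, which preserves its
# marginal distribution but destroys any relationship with the label.
X_shadow = np.column_stack(
    [rng.permutation(X[:, j]) for j in range(n_features)]
)
X_aug = np.hstack([X, X_shadow])

forest = RandomForestClassifier(n_estimators=300, random_state=0)
forest.fit(X_aug, y)

importances = forest.feature_importances_
real_importance = importances[:n_features]
shadow_max = importances[n_features:].max()

# A "hit" means the real feature outscored the best shadow feature this round.
hits = real_importance > shadow_max
for j in range(n_features):
    print(f"feature {j}: importance={real_importance[j]:.3f}, hit={bool(hits[j])}")
```

In this toy setting the two label-driving columns should clearly beat the best shadow feature, while a noise column may occasionally score a hit by chance, which is why the full procedure accumulates hits over many rounds.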

Cited by

A comparative study of neuro-fuzzy and neural network models in predicting length of stay in university hospital.
Yabana Kiremit B, Dikmetaş Yardan E. BMC Health Serv Res. 2025 Mar 27;25(1):446. doi: 10.1186/s12913-025-12623-x. PMID: 40148882.
Predictors of Medical and Dental Clinic Closure by Machine Learning Methods: Cross-Sectional Study Using Empirical Data.
Park YT, Kim D, Jeon JS, Kim KG. J Med Internet Res. 2024 Aug 30;26:e46608. doi: 10.2196/46608. PMID: 39213534.
PSO-XnB: a proposed model for predicting hospital stay of CAD patients.
Miriyala GP, Sinha AK. Front Artif Intell. 2024 May 3;7:1381430. doi: 10.3389/frai.2024.1381430. eCollection 2024. PMID: 38765633.
Analyzing COVID-19 progression with Markov multistage models: insights from a Korean cohort.
Ndagijimana FAR, Park T. Genomics Inform. 2025 Jan 27;23(1):2. doi: 10.1186/s44342-024-00035-y. PMID: 39891219.
We Need to Talk About Lung Ultrasound Score: Prediction of Intensive Care Unit Admission with Machine Learning.
Oliveira-Saraiva D, Leote J, Gonzalez FA, Garcia NC, Ferreira HA. J Imaging. 2025 Feb 7;11(2):45. doi: 10.3390/jimaging11020045. PMID: 39997547.
