PLoS One. 2023 Oct 20;18(10):e0292888.
doi: 10.1371/journal.pone.0292888. eCollection 2023.

Machine learning-based prediction models for home discharge in patients with COVID-19: Development and evaluation using electronic health records

Ruben D Zapata et al. PLoS One. 2023.

Abstract

Objective: This study aimed to develop and validate predictive models using electronic health records (EHR) data to determine whether hospitalized COVID-19-positive patients would be admitted to alternative medical care or discharged home.

Methods: We conducted a retrospective cohort study using deidentified data from the University of Florida Health Integrated Data Repository. The study included 1,578 adult patients (≥18 years) who tested positive for COVID-19 while hospitalized, comprising 960 (60.8%) female patients with a mean (SD) age of 51.86 (18.49) years and 618 (39.2%) male patients with a mean (SD) age of 54.35 (18.48) years. The machine learning (ML) models were trained with cross-validation to assess how well each model predicted patient disposition.

Results: We developed and validated six supervised ML-based prediction models (logistic regression, Gaussian Naïve Bayes, k-nearest neighbors, decision tree, random forest, and support vector machine classifier) to predict patient discharge status. The models were evaluated based on the area under the receiver operating characteristic curve (ROC-AUC), precision, accuracy, F1 score, and Brier score. The random forest classifier achieved the highest AUC (0.72), with an accuracy of 0.84. The other models reached similar accuracy but lower AUC: logistic regression (accuracy: 0.85, AUC: 0.71), support vector machine classifier (accuracy: 0.84, AUC: 0.67), Gaussian Naïve Bayes (accuracy: 0.84, AUC: 0.66), k-nearest neighbors (accuracy: 0.84, AUC: 0.63), and decision tree (accuracy: 0.84, AUC: 0.61).
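
The evaluation metrics named above can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the coding of label 1 as "discharged home", and the six toy predictions are all assumptions made for illustration, and none of the numbers come from the study.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, and F1 after turning probabilities into labels."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "f1": f1}


# Hypothetical toy data: 1 = discharged home, 0 = alternative care (assumed coding).
y_true = [1, 1, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
metrics = threshold_metrics(y_true, y_prob)
print(auc, brier, metrics)
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas ROC-AUC measures ranking quality across all thresholds and the Brier score measures how close the predicted probabilities are to the outcomes. This is one reason models can report near-identical accuracy while differing in AUC.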

Significance: This study's findings are crucial for efficiently allocating healthcare resources during pandemics like COVID-19. By harnessing ML techniques and EHR data, we can create predictive tools to identify patients at greater risk of severe symptoms based on their medical histories. The models developed here serve as a foundation for expanding the toolkit available to healthcare professionals and organizations. Additionally, explainable ML methods, such as Shapley Additive Explanations, aid in uncovering underlying data features that inform healthcare decision-making processes.
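
To make the idea behind Shapley-based explanations concrete, here is a small sketch that computes exact Shapley values by enumerating every feature ordering. It is an illustration only: the study refers to Shapley Additive Explanations, which in practice rely on efficient approximations, whereas this brute-force version is feasible only for a handful of features. The scoring function `toy_score`, its coefficients, the input, and the all-zero baseline are hypothetical and have no clinical meaning.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    A feature counts as "absent" when it is held at its baseline value.
    Each feature's value is its marginal contribution averaged over all
    orderings in which features are switched from baseline to x.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i on
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [t / len(orderings) for t in totals]


def toy_score(f):
    """Hypothetical model: two linear terms plus one interaction term."""
    return 1.0 + 2.0 * f[0] - 1.0 * f[1] + 0.5 * f[0] * f[2]


x = [1.0, 2.0, 4.0]         # hypothetical input
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. A violin plot of Shapley values then summarizes how such per-patient attributions are distributed for each feature.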

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. ROC-AUC and ML calibration plot.
(A) ROC-AUC, area under the receiver operating characteristic curve. (B) Calibration plot of machine learning models.

Fig 2. Violin plot Shapley values.
