J Med Internet Res. 2020 Oct 6;22(10):e21439.
doi: 10.2196/21439.

Clinical Predictive Models for COVID-19: Systematic Study

Patrick Schwab et al. J Med Internet Res. 2020.

Abstract

Background: COVID-19 is a rapidly emerging respiratory disease caused by SARS-CoV-2. Due to the rapid human-to-human transmission of SARS-CoV-2, many health care systems are at risk of exceeding their capacities, in particular in terms of SARS-CoV-2 tests, hospital and intensive care unit (ICU) beds, and mechanical ventilators. Predictive algorithms could potentially ease the strain on health care systems by identifying those who are most likely to receive a positive SARS-CoV-2 test, be hospitalized, or be admitted to the ICU.

Objective: The aim of this study is to develop, study, and evaluate clinical predictive models that use machine learning and routinely collected clinical data to estimate which patients are likely to receive a positive SARS-CoV-2 test or to require hospitalization or intensive care.

Methods: Using a systematic approach to model development and optimization, we trained and compared various types of machine learning models, including logistic regression, neural networks, support vector machines, random forests, and gradient boosting. To evaluate the developed models, we performed a retrospective evaluation on demographic, clinical, and blood analysis data from a cohort of 5644 patients. In addition, we used causal explanations to determine which clinical features were predictive, and to what degree, for each of the aforementioned clinical tasks.
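The compare-and-select step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the two candidates (`fit_majority`, `fit_threshold`) are hypothetical toy stand-ins for the model families listed above, the data are made up, and using mean cross-validated accuracy as the selection criterion is an assumption, since the abstract does not state which criterion was used.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))


def fit_majority(X, y):
    """Hypothetical baseline: always predict the majority training label."""
    majority = 1 if 2 * sum(y) >= len(y) else 0
    return lambda x: majority


def fit_threshold(X, y):
    """Hypothetical one-feature rule: cut halfway between the class means."""
    pos = mean(x for x, t in zip(X, y) if t == 1)
    neg = mean(x for x, t in zip(X, y) if t == 0)
    cut = (pos + neg) / 2
    return lambda x: 1 if x > cut else 0


def select_best(candidates, X, y, k=3):
    """Score each candidate by mean validation accuracy over k folds."""
    folds = k_fold_indices(len(X), k)
    results = {}
    for name, fit in candidates.items():
        scores = []
        for val_idx in folds:
            held_out = set(val_idx)
            X_train = [x for i, x in enumerate(X) if i not in held_out]
            y_train = [t for i, t in enumerate(y) if i not in held_out]
            predict = fit(X_train, y_train)  # train on the remaining folds
            y_val = [y[i] for i in val_idx]
            y_hat = [predict(X[i]) for i in val_idx]
            scores.append(accuracy(y_val, y_hat))
        results[name] = mean(scores)
    best = max(results, key=results.get)
    return best, results


# Made-up single-feature data; labels alternate so every fold has both classes.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
y = [0, 1, 0, 1, 0, 1]
CANDIDATES = {"majority": fit_majority, "threshold": fit_threshold}
best_name, cv_scores = select_best(CANDIDATES, X, y, k=3)
print(best_name, cv_scores)
```

On this toy data the threshold rule classifies every held-out point correctly, so it is selected over the majority baseline. The selected model would then be evaluated once on data not used for selection.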

Results: Our experimental results indicate that our predictive models identified patients who test positive for SARS-CoV-2 a priori at a sensitivity of 75% (95% CI 67%-81%) and a specificity of 49% (95% CI 46%-51%), SARS-CoV-2-positive patients who require hospitalization with 0.92 area under the receiver operator characteristic curve (AUC; 95% CI 0.81-0.98), and SARS-CoV-2-positive patients who require critical care with 0.98 AUC (95% CI 0.95-1.00).
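For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and AUC are defined. The labels and scores are made up for illustration and are not the study's data; the 0.5 cutoff is likewise an arbitrary assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 4 positive and 4 negative patients with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # arbitrary 0.5 cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)           # 0.75 0.75
print(auc(y_true, scores))  # 0.8125
```

Sensitivity and specificity depend on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is why the abstract can report a cutoff-specific pair for one task and AUC for the others.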

Conclusions: Our results indicate that predictive models trained on routinely collected clinical data could be used to predict clinical pathways for COVID-19 and, therefore, help inform care and prioritize resources.

Keywords: COVID-19; SARS-CoV-2; clinical data; clinical prediction; hospitalization; infectious disease; intensive care; machine learning; prediction; testing.

Conflict of interest statement

Conflicts of Interest: PS is an employee and shareholder of F Hoffmann-La Roche Ltd.

Figures

Figure 1
We study the use of predictive models (light purple) to estimate whether patients are likely (i) to be SARS-CoV-2 positive and whether SARS-CoV-2 positive patients are likely (ii) to be admitted to the hospital and (iii) to require critical care based on clinical, demographic, and blood analysis data. Accurate clinical predictive models stratify patients according to individual risk and, in this manner, help prioritize health care resources such as testing, hospital, and critical care capacity.
Figure 2
The presented multistage machine learning pipeline consists of preprocessing (light purple) the input data x, developing multiple candidate models using the given data set (orange), selecting the best candidate model for evaluation (blue), and evaluating the selected best model's outputs ŷ.
Figure 3
A comparison of the top 10 features ranked by relative feature importance scores for the best-encountered model for predicting SARS-CoV-2 test results (gradient boosting, top), hospital admissions (random forest, middle), and critical care admission for patients who are SARS-CoV-2 positive (support vector machine, bottom), respectively. The bar length corresponds to the relative marginal importance (in %) of the displayed features toward the predictive performance of the respective model. Feature names that include “MISSING” indicate that the given marginal contribution refers to the importance of that feature being missing, not to the feature's value itself.
Figure 4
Receiver operator characteristic curves for the best-encountered model for predicting SARS-CoV-2 test results (gradient boosting, left), hospital admissions for patients who are SARS-CoV-2 positive (random forest, top right), and critical care admissions for patients who are SARS-CoV-2 positive (support vector machine, bottom right). Numbers in the bottom right of each subgraph show the respective model's AUC. Solid dots on the curves indicate operating thresholds selected on the validation fold. AUC: Area under the receiver operator characteristic curve.
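
The Figure 4 caption mentions operating thresholds selected on the validation fold. The sketch below shows one plausible way to make such a selection; the rule used here (take the highest threshold whose validation sensitivity meets a target) is an assumption, since the caption does not state the criterion, and the function name `pick_threshold` and the data are hypothetical.

```python
def pick_threshold(y_val, scores_val, target_sensitivity):
    """Return the highest threshold whose validation sensitivity meets the target.

    A patient is flagged positive when score >= threshold. Among thresholds
    that reach the target sensitivity, the highest one flags the fewest
    patients and therefore gives the best specificity.
    """
    n_pos = sum(y_val)
    if n_pos == 0:
        raise ValueError("validation fold contains no positive cases")
    for thr in sorted(set(scores_val), reverse=True):
        tp = sum(1 for t, s in zip(y_val, scores_val) if t == 1 and s >= thr)
        if tp / n_pos >= target_sensitivity:
            return thr
    raise ValueError("target sensitivity cannot be reached")


# Made-up validation fold: 4 positive and 4 negative patients.
y_val = [1, 1, 1, 1, 0, 0, 0, 0]
scores_val = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

threshold = pick_threshold(y_val, scores_val, target_sensitivity=0.75)
print(threshold)  # 0.6
```

Fixing the threshold on the validation fold and only then applying it to held-out test data keeps the reported sensitivity and specificity from being tuned on the data they are measured on.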
