Int J Med Inform. 2021 Nov;155:104594.
doi: 10.1016/j.ijmedinf.2021.104594. Epub 2021 Sep 23.

Predicting prognosis in COVID-19 patients using machine learning and readily available clinical data

Thomas W Campbell et al. Int J Med Inform. 2021 Nov.

Abstract

Rationale: Prognostic tools that aid in the treatment of hospitalized COVID-19 patients could help improve outcomes by identifying patients at higher or lower risk of severe disease. The study objective was to develop models that stratify patients by risk of severe outcomes during COVID-19 hospitalization using information readily available at hospital admission.

Methods: Hierarchical ensemble classification models were trained on a set of 229 patients hospitalized with COVID-19 to predict severe outcomes, including ICU admission, development of acute respiratory distress syndrome, or intubation, using easily attainable attributes including basic patient characteristics, vital signs at admission, and basic lab results collected at time of presentation. Each test stratifies patients into groups of increasing risk. An additional cohort of 330 patients was used for blinded, independent validation. Shapley value analysis evaluated which attributes contributed most to the models' predictions of risk.
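The hierarchical stratification described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function, its weights, the cut-offs, and the four-group layout below are hypothetical placeholders, chosen only to show how one top-level classifier and two second-level classifiers might combine into ordered risk groups. In a real test each level would be a separately trained model rather than the same score with different cut-offs.

```python
from typing import Callable, Dict

Patient = Dict[str, float]
Classifier = Callable[[Patient], bool]


def toy_score(patient: Patient) -> float:
    """Hypothetical risk score; the weights are illustrative, not fitted."""
    return patient["crp"] / 100 + patient["ldh"] / 500 + patient["d_dimer"] / 2


def threshold_classifier(cutoff: float) -> Classifier:
    """Return a classifier that flags 'higher risk' when the toy score >= cutoff."""
    return lambda patient: toy_score(patient) >= cutoff


# Hypothetical cut-offs. In the study, all cut-offs were fixed before validation.
top = threshold_classifier(1.5)           # splits the whole cohort
split_lower = threshold_classifier(1.0)   # refines the lower-risk branch
split_higher = threshold_classifier(3.0)  # refines the higher-risk branch


def assign_risk_group(patient: Patient) -> str:
    """Route a patient through the two-level cascade to a final risk group."""
    if top(patient):
        return "highest" if split_higher(patient) else "high"
    return "low" if split_lower(patient) else "lowest"


if __name__ == "__main__":
    example = {"crp": 60.0, "ldh": 300.0, "d_dimer": 0.4}
    print(assign_risk_group(example))
```

Each patient passes through exactly two classifiers, so the final groups are disjoint and ordered by construction.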

Main results: Test performance was assessed using precision (positive predictive value) and recall (sensitivity) of the final risk groups. All test cut-offs were fixed prior to blinded validation. In development and validation, the tests achieved precision in the lowest risk groups near or above 0.9. The proportion of patients with severe outcomes significantly increased across increasing risk groups. While the importance of attributes varied by test and patient, C-reactive protein, lactate dehydrogenase, and D-dimer were often found to be important in the assignment of risk.
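As an illustration of how precision and recall can be computed for a single final risk group, the sketch below uses made-up data. The function name, the group labels, and the counts are all hypothetical. It also assumes that, for a low-risk group, the class of interest is the absence of a severe outcome; the abstract does not spell out this convention, so the `target` argument leaves the choice to the reader.

```python
from typing import Sequence, Tuple


def group_precision_recall(
    outcomes: Sequence[int],
    groups: Sequence[str],
    group: str,
    target: int,
) -> Tuple[float, float]:
    """Precision and recall of one risk group with respect to a target class.

    precision = fraction of patients in `group` whose outcome equals `target`
    recall    = fraction of all `target` patients that fall into `group`
    """
    in_group = [o for o, g in zip(outcomes, groups) if g == group]
    hits = sum(1 for o in in_group if o == target)
    total_target = sum(1 for o in outcomes if o == target)
    precision = hits / len(in_group) if in_group else float("nan")
    recall = hits / total_target if total_target else float("nan")
    return precision, recall


# Made-up cohort: 1 = severe outcome, 0 = no severe outcome.
groups = ["lowest"] * 10 + ["middle"] * 5 + ["highest"] * 5
outcomes = [0] * 9 + [1] + [0, 0, 0, 1, 1] + [1, 1, 1, 1, 0]

if __name__ == "__main__":
    print(group_precision_recall(outcomes, groups, "lowest", target=0))
    print(group_precision_recall(outcomes, groups, "highest", target=1))
```

In this toy cohort, 9 of the 10 patients in the lowest risk group have no severe outcome, which corresponds to a precision of 0.9 under the stated assumption.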

Conclusions: Risk of severe outcomes for patients hospitalized with COVID-19 infection can be assessed using machine learning models built on attributes routinely collected at hospital admission.

Keywords: COVID-19; Clinical decision support systems; Machine learning; Prognostic models; Risk assessment.

Conflict of interest statement

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: ‘The authors have no major conflicts of interest to disclose. TC, HR, RG, LM, and JR are named as inventors on a provisional patent assigned to Biodesix relevant to the work and hold stock and/or stock options in Biodesix. KE reports grant funding from the NIH during the conduct of the study and grant funding from Gilead and personal fees from Theratechnologies and ViiV outside of the conduct of the study’.

Figures

Fig. 1
CONSORT diagrams of patient selection down to development (A) and validation (B) cohorts.
Fig. 2
Hierarchical Configuration of Classifiers used for Risk Assessment for Each Endpoint. A Diagnostic Cortex model with in-bag decision tree model (represented by the top box) was used to stratify the entire development cohort into a higher and lower risk group for each endpoint. Diagnostic Cortex models (middle boxes) without trees were used to split the resulting two groups further according to one of the two schemas. (Schema A was used for the tests predicting risk of any complication and intubation. Schema B was used for the tests predicting risk of ARDS and admission to the ICU.)
Fig. 3
Time from data collection to ICU admission for the 85 patients in the validation cohort who were admitted to the ICU, indicating potential utility for ICU admission risk assessment at hospital admission. A time in the [0, 1] bin indicates that the patient was admitted on the same day the data were collected.
Fig. 4
Performance Flow Chart for the Risk Assessment Test for ICU Admission for (A) the Development Cohort and (B) the Validation Cohort. Each uncolored box represents a classifier with the contents reflecting the set of patients to be classified by the classifier. The colored boxes represent the final risk groups with the contents reflecting composition of the groups and test performance. Bootstrap 95% confidence intervals for performance metrics are given in the supplement. Pos = Positive (Admitted to ICU); Neg = Negative (Not Admitted to ICU), PPV = Positive Predictive Value.
Fig. 5
Performance Flow Chart for the Test Predicting Risk of Developing ARDS for (A) the Development Cohort and (B) the Validation Cohort. Each uncolored box represents a classifier with the contents reflecting the set of patients to be classified by the classifier. The colored boxes represent the final risk groups with the contents reflecting composition of the groups and test performance. Bootstrap 95% confidence intervals for performance metrics are given in the supplement. Pos = Positive (Developed ARDS); Neg = Negative (Did not develop ARDS), PPV = Positive Predictive Value.
Fig. 6
Performance Flow Chart for the Test Assessing Risk of Intubation for (A) the Development Cohort and (B) the Validation Cohort. Each uncolored box represents a classifier with the contents reflecting the set of patients to be classified by the classifier. The colored boxes represent the final risk groups with the contents reflecting composition of the groups and test performance. Bootstrap 95% confidence intervals for performance metrics are given in the supplement. Pos = Positive (Intubated); Neg = Negative (Not intubated), PPV = Positive Predictive Value.
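The figure captions mention bootstrap 95% confidence intervals for the performance metrics. The sketch below shows one common way such an interval could be obtained for the PPV of a single risk group: a percentile bootstrap that resamples the patients within that group. The authors' actual procedure is described in their supplement and may differ (for example, it may resample the whole cohort); the function name, defaults, and example counts here are hypothetical.

```python
import random
from typing import Sequence, Tuple


def bootstrap_ppv_ci(
    outcomes: Sequence[int],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the PPV of one risk group.

    `outcomes` holds 1 for each patient in the group who had the event
    and 0 for each patient who did not.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    stats = []
    for _ in range(n_boot):
        # Resample the group with replacement and recompute the PPV.
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Made-up group: 30 of 40 patients had the event (PPV = 0.75).
    group_outcomes = [1] * 30 + [0] * 40[:0] if False else [1] * 30 + [0] * 10
    print(bootstrap_ppv_ci(group_outcomes))
```

With small risk groups the resulting interval can be wide, which is one reason to report it alongside the point estimate.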
