Int J Epidemiol. 2021 Jan 23;49(6):1918-1929.
doi: 10.1093/ije/dyaa171.

Early prediction of mortality risk among patients with severe COVID-19, using machine learning

Chuanyu Hu et al. Int J Epidemiol.

Abstract

Background: Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 infection, has been spreading globally. We aimed to develop a clinical model for the early prediction of outcomes in patients with severe COVID-19 infection.

Methods: Demographic, clinical and first laboratory findings after admission of 183 patients with severe COVID-19 infection (115 survivors and 68 non-survivors from the Sino-French New City Branch of Tongji Hospital, Wuhan) were used to develop the predictive models. Machine learning approaches were used to select the features and predict the patients' outcomes. The area under the receiver operating characteristic curve (AUROC) was used to compare the models' performance. A total of 64 patients with severe COVID-19 infection from the Optical Valley Branch of Tongji Hospital, Wuhan, were used to externally validate the final predictive model.
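As an illustration of the AUROC metric named in the Methods, the following minimal Python sketch computes it from its rank-based definition: the probability that a randomly chosen non-survivor receives a higher predicted score than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (invented data): 1 = non-survivor, 0 = survivor.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
print(round(toy_auroc, 3))
```

The pairwise loop is quadratic in the number of patients, which is fine for cohorts of this size; a sort-based version would be preferable for large datasets.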

Results: The baseline characteristics and laboratory tests were significantly different between the survivors and non-survivors. Four variables (age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level) were selected by all five models. Given the similar performance among the models, the logistic regression model was selected as the final predictive model because of its simplicity and interpretability. The AUROC for the external validation set was 0.881. At a cutoff of 50% probability of death, the sensitivity and specificity in the validation set were 0.839 and 0.794, respectively. A risk score based on the selected variables can be used to assess mortality risk. The predictive model is available at [https://phenomics.fudan.edu.cn/risk_scores/].
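To make the cutoff-based metrics in the Results concrete, here is a minimal Python sketch of how sensitivity and specificity follow from predicted probabilities of death and a chosen cutoff. The function name, the toy data, and the choice to treat a probability equal to the cutoff as a predicted death are all assumptions for illustration; the abstract does not specify them.

```python
def sensitivity_specificity(labels, probs, cutoff=0.5):
    """Return (sensitivity, specificity) for binary labels (1 = died)
    given predicted probabilities of death and a decision cutoff.
    Assumption: probability >= cutoff is classified as a predicted death."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_death = p >= cutoff
        if y == 1 and predicted_death:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_death:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case in each outcome class")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (invented data): 3 non-survivors, 4 survivors.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4]
sens, spec = sensitivity_specificity(toy_labels, toy_probs, cutoff=0.5)
print(sens, spec)
```

Lowering the cutoff trades specificity for sensitivity, which is the kind of trade-off that cost curves over different cutoffs are meant to summarize.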

Conclusions: Age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level of COVID-19 patients at admission are informative for the patients' outcomes.

Keywords: COVID-19; death; fatality rate; machine learning; predictive model.

Figures

Figure 1
The study flow chart. bagFDA, bagged flexible discriminant analysis; PLS, partial least squares; AUROC, area under the receiver operating characteristic curve
Figure 2
The top 20 important variables selected by five machine learning models (A-E) and the model performance based on the selected variables (F). NT-proBNP, N-terminal pro-brain natriuretic peptide; hsCRP, high-sensitivity C-reactive protein; CRE, creatinine; ALT, alanine aminotransferase; IL8, interleukin 8; LYM, lymphocyte count; IL6, interleukin 6; WC, white cell count; eGFR, estimated glomerular filtration rate; ALP, alkaline phosphatase; RC, red cell count; TCH, total cholesterol; FIB, fibrinogen; PLT, platelet count; TB, total bilirubin; ALB, albumin; PT, prothrombin time; IL10, interleukin 10; IL2R, interleukin-2 receptor; FER, ferritin; LDH, lactate dehydrogenase; hscTnI, high-sensitivity cardiac troponin I; Mono, monocyte count; UA, uric acid; ESR, erythrocyte sedimentation rate; No.Com., number of basic conditions
Figure 3
The area under the receiver operating characteristic curve (AUROC) of the logistic regression model based on selected variables in the derivation set (A) and the external validation set (B)
Figure 4
The relationship between probability of death and the normalized probability (A) and the cost curves of the predictive models using different cutoffs (B)
Figure 5
The formula to calculate the risk scores (A) and their distributions among survivors and non-survivors (B) and the corresponding probability of death (C)
