Sci Rep. 2020 Oct 30;10(1):18716.
doi: 10.1038/s41598-020-75767-2.

Machine learning prediction for mortality of patients diagnosed with COVID-19: a nationwide Korean cohort study

Chansik An et al. Sci Rep.

Abstract

The rapid spread of COVID-19 has resulted in a shortage of medical resources, which necessitates accurate prognosis prediction to triage patients effectively. This study used the nationwide cohort of South Korea to develop a machine learning model that predicts prognosis from sociodemographic and medical information. Of 10,237 COVID-19 patients, 228 (2.2%) died, 7772 (75.9%) recovered, and 2237 (21.9%) were still in isolation or being treated at the last follow-up (April 16, 2020). Cox proportional hazards regression analysis revealed that age > 70, male sex, moderate or severe disability, the presence of symptoms, nursing home residence, and comorbidities of diabetes mellitus (DM), chronic lung disease, or asthma were significantly associated with an increased risk of mortality (p ≤ 0.047). For machine learning, the least absolute shrinkage and selection operator (LASSO), linear support vector machine (SVM), SVM with radial basis function kernel, random forest (RF), and k-nearest neighbors were tested. In the prediction of mortality, LASSO and linear SVM demonstrated high sensitivities (90.7% [95% confidence interval: 83.3, 97.3] and 92.0% [85.9, 98.1], respectively) and high specificities (91.4% [90.3, 92.5] and 91.8% [90.7, 92.9], respectively), as well as high areas under the receiver operating characteristic curve (0.963 [0.946, 0.979] and 0.962 [0.945, 0.979], respectively). The most significant predictors for LASSO included old age and preexisting DM or cancer; for RF they were old age, infection route (cluster infection or infection from personal contact), and underlying hypertension. The proposed prediction model may be helpful for the quick triage of patients without having to wait for the results of additional tests, such as laboratory or radiologic studies, during a pandemic when limited medical resources must be allocated wisely and without delay.
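
The three performance measures reported in the abstract (sensitivity, specificity, and area under the receiver operating characteristic curve) can be illustrated with a minimal sketch. It is not the authors' code: the function names and the eight-patient toy data are hypothetical, and only the cohort counts (228, 7772, 2237) come from the abstract. The sketch uses the Python standard library only and computes the AUC as the fraction of (death, survivor) pairs in which the death received the higher risk score.

```python
from typing import Dict, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = died."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Outcome counts taken from the abstract; the shares should match 2.2%, 75.9%, 21.9%.
cohort: Dict[str, int] = {"died": 228, "recovered": 7772, "isolated_or_treated": 2237}
total = sum(cohort.values())
shares = {name: round(100 * count / total, 1) for name, count in cohort.items()}

# Hypothetical toy data (NOT from the study): 3 deaths, 5 survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]  # predicted risk of death
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)

print(shares)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

On the toy data, one death falls below the assumed 0.5 threshold and one survivor falls above it, so both sensitivity and specificity are below 1. Lowering the threshold trades specificity for sensitivity, which is the usual preference in triage when a missed high-risk patient is the costlier error.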

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Histogram illustrating the distribution of the time interval between diagnosis and recovery (A) or mortality (B).
Figure 2. Box plot illustrating the time interval between diagnosis and recovery or mortality according to the age group.
Figure 3. Variable importance in prediction of mortality from COVID-19 by LASSO (A) and Random Forest (B).
Figure 4. Flow diagram for study participants.

