Front Public Health. 2021 May 12:9:626697.

doi: 10.3389/fpubh.2021.626697. eCollection 2021.

Machine Learning Based Clinical Decision Support System for Early COVID-19 Mortality Prediction

Akshaya Karthikeyan¹, Akshit Garg¹, P K Vinod¹, U Deva Priyakumar¹

PMID: 34055710
PMCID: PMC8149622
DOI: 10.3389/fpubh.2021.626697

Akshaya Karthikeyan et al. Front Public Health. 2021.

Affiliation

¹ Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India.

Abstract

Coronavirus disease 2019 (COVID-19), caused by the virus SARS-CoV-2, is an acute respiratory disease that has been classified as a pandemic by the World Health Organization (WHO). The sudden spike in the number of infections and the high mortality rates have put immense pressure on public healthcare systems. Hence, it is crucial to identify the key factors for mortality prediction in order to optimize patient treatment strategy. Routine blood test results are more widely available for mortality prediction than other forms of data such as X-rays, CT scans, and ultrasounds. This study proposes machine learning (ML) methods based on blood test data to predict COVID-19 mortality risk. A combination of five features, namely neutrophils, lymphocytes, lactate dehydrogenase (LDH), high-sensitivity C-reactive protein (hs-CRP), and age, predicts mortality with 96% accuracy. Various ML models (neural networks, logistic regression, XGBoost, random forests, SVM, and decision trees) were trained and their performance compared to determine the model that achieves consistently high accuracy across the days that span the disease. The best-performing method, which uses XGBoost feature importance and neural network classification, predicts with an accuracy of 90% as early as 16 days before the outcome. Robust testing with three cases based on days to outcome confirms the strong predictive performance and practicality of the proposed model. A detailed analysis and identification of trends was performed using these key biomarkers to provide useful insights for intuitive application. This study provides solutions that would help accelerate decision-making in healthcare systems for focused medical treatment in an accurate, early, and reliable manner.

Keywords: biomarkers; coronavirus disease 2019; machine learning; mortality; prognosis.
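
To make the idea of a five-feature mortality classifier concrete, the following is a minimal sketch of logistic regression, one of the model families listed in the abstract. It is not the authors' XGBoost and neural network pipeline. The data are synthetic, the feature identifiers and the weights used to generate the toy labels are arbitrary assumptions, and nothing here reflects the study's data or results.

```python
import math
import random

# The five features named in the abstract; the identifiers are illustrative.
FEATURES = ["neutrophils", "lymphocytes", "ldh", "hs_crp", "age"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n, seed=0):
    """Generate SYNTHETIC standardized features and labels (not study data)."""
    rng = random.Random(seed)
    # Arbitrary weights, chosen only so the toy labels depend on the features.
    toy_w = [1.0, -1.0, 1.0, 1.0, 0.5]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        p = sigmoid(sum(w * v for w, v in zip(toy_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Return the predicted probability of the positive class for one sample."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def accuracy(w, b, X, y):
    """Fraction of samples whose thresholded prediction (0.5) matches the label."""
    hits = sum((predict_risk(w, b, x) >= 0.5) == bool(label) for x, label in zip(X, y))
    return hits / len(y)


X_train, y_train = make_synthetic(400, seed=0)
X_test, y_test = make_synthetic(200, seed=1)
w, b = fit_logistic(X_train, y_train)
print("held-out accuracy on synthetic data:", round(accuracy(w, b, X_test, y_test), 3))
```

In practice, real blood test values would need imputation and scaling before a model of this kind is fitted, and the learned weights would have to be validated on held-out patients.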

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Distribution of the two classes in the train and test sets after splitting.

**Figure 2**
Flowchart depicting the model development pipeline used in this study.

**Figure 3**
Architecture of the neural network implemented for feature selection, where n represents the number of features to be analyzed.

**Figure 4**
Comparison of the performance of different machine learning algorithms assessed using different metrics. The vertical lines denote the standard deviations. **(A)** Accuracy. **(B)** AUC and F1 score.

**Figure 5**
The performance of neural net on the test data using case 1: Number of days to outcome less than or equal to n. **(A)** The class-wise distribution of the cumulated data-points (≤ *nth* day) for all samples in the imputed test set. **(B)** Accuracy of the model evaluated for different days to outcome. **(C)** F1-score and AUC of the model evaluated for different days to outcome.

**Figure 6**
The performance of neural net on the test data using case 2: Number of days to outcome greater than or equal to n. **(A)** The class-wise distribution of the cumulated data-points (≤ *nth* day) for all samples in the imputed test set. **(B)** Accuracy of the model evaluated for different days to outcome. **(C)** F1-score and AUC of the model evaluated for different days to outcome.

**Figure 7**
The performance of neural net on the test data using case 3: Number of days to outcome equal to n. **(A)** The class-wise distribution of the cumulated data-points (≤ *nth* day) for all samples in the imputed test set. **(B)** Accuracy of the model evaluated for different days to outcome. **(C)** F1-score and AUC of the model evaluated for different days to outcome.

**Figure 8**
Box and whisker plot showing the variations of four selected features with respect to the days to outcome. **(A)** hs-CRP, **(B)** neutrophils (%), **(C)** lymphocyte (%), **(D)** lactate dehydrogenase.
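
The three test cases named in the captions of Figures 5 to 7 (days to outcome less than or equal to n, greater than or equal to n, and equal to n) can be sketched as a filter followed by the metrics the figures report: accuracy, F1 score, and AUC. This is an illustrative sketch only. The record layout (`days_to_outcome`, `label`, `score`), the 0.5 threshold, and the sample values are hypothetical and are not taken from the study.

```python
import operator

# Case number -> comparison applied to (days_to_outcome, n), as in Figures 5-7.
CASES = {1: operator.le, 2: operator.ge, 3: operator.eq}


def select_case(samples, n, case):
    """Keep the samples whose days to outcome satisfy the chosen case for n."""
    compare = CASES[case]
    return [s for s in samples if compare(s["days_to_outcome"], n)]


def evaluate(samples, threshold=0.5):
    """Return accuracy, F1, and AUC for a list of scored samples.

    Returns None for an empty subset; AUC is None when only one class is present.
    """
    if not samples:
        return None
    y = [s["label"] for s in samples]
    scores = [s["score"] for s in samples]
    pred = [1 if sc >= threshold else 0 for sc in scores]

    tp = sum(1 for t, p in zip(y, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y, pred) if t == 1 and p == 0)
    acc = sum(1 for t, p in zip(y, pred) if t == p) / len(y)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    # AUC as the probability that a positive outscores a negative (ties count half).
    pos = [sc for sc, t in zip(scores, y) if t == 1]
    neg = [sc for sc, t in zip(scores, y) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = None
    return {"accuracy": acc, "f1": f1, "auc": auc}


# Hypothetical scored test samples (label 1 = death, 0 = survival).
samples = [
    {"days_to_outcome": 0, "label": 1, "score": 0.9},
    {"days_to_outcome": 2, "label": 0, "score": 0.2},
    {"days_to_outcome": 2, "label": 1, "score": 0.6},
    {"days_to_outcome": 5, "label": 0, "score": 0.7},
    {"days_to_outcome": 9, "label": 1, "score": 0.4},
    {"days_to_outcome": 9, "label": 0, "score": 0.1},
]

for case in (1, 2, 3):
    for n in (2, 5):
        print(case, n, evaluate(select_case(samples, n, case)))
```

Sweeping n over the range of days and plotting each metric per case would give curves of the kind described in panels (B) and (C) of those figures.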
