A machine learning based exploration of COVID-19 mortality risk

Mahdi Mahdavi^{1,2}, Hadi Choubdar^{1,2}, Erfan Zabeh³, Michael Rieder^{4,5,6,7}, Safieddin Safavi-Naeini⁸, Zsolt Jobbagy⁹, Amirata Ghorbani¹⁰, Atefeh Abedini¹¹, Arda Kiani¹², Vida Khanlarzadeh⁶, Reza Lashgari¹, Ehsan Kamrani^{4,8}

Affiliations

¹ Institute of Medical Science and Technology (IMSAT), Shahid Beheshti University, Tehran, Iran.
² Department of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
³ Department of Biomedical Engineering, Columbia University, New York, NY, United States of America.
⁴ Robarts Research Institute, University of Western Ontario, London, ON, Canada.
⁵ Department of Paediatrics, Children's Hospital of Western Ontario, London, ON, Canada.
⁶ Department of Medicine, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada.
⁷ CIHR-GSK Chair in Pediatric Clinical Pharmacology, Children's Health Research Institute, London, ON, Canada.
⁸ CIARS (Centre for Intelligent Antenna and Radio Systems), Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada.
⁹ Department of Pathology, Immunology and Molecular Pathology, Rutgers New Jersey Medical School, Newark, NJ, United States of America.
¹⁰ Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America.
¹¹ Chronic Respiratory Diseases Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran.
¹² Tracheal Diseases Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran.

PMID: 34214101
PMCID: PMC8253432
DOI: 10.1371/journal.pone.0252384

Mahdi Mahdavi et al. PLoS One. 2021 Jul 2;16(7):e0252384. doi: 10.1371/journal.pone.0252384. eCollection 2021.

Abstract

Early prediction of patient mortality risk during a pandemic can decrease mortality by ensuring efficient resource allocation and treatment planning. This study aimed to develop and compare machine learning models for prognosis prediction based on invasive laboratory data and non-invasive clinical and demographic data from patients' day of admission. Three Support Vector Machine (SVM) models were developed and compared using invasive features, non-invasive features, and both groups combined. The results suggested that non-invasive features could provide mortality predictions similar to those of the invasive model and roughly on par with the joint model. Feature inspection with SVM-RFE and sparsity analysis showed that, compared with the invasive model, the non-invasive model can perform better with fewer features, pointing to the high predictive information content of several non-invasive features, including SPO₂, age, and cardiovascular disorders. Furthermore, while the invasive model provided better mortality predictions for the imminent future, non-invasive features performed better for more distant expiration intervals. Early mortality prediction using non-invasive models can give insight into where and with whom to intervene. Combined with novel technologies, such as wireless wearable devices, these models can create powerful frameworks for various medical assignments and patient triage.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Illustration of the modeling framework.**
Three machine learning models were developed using the SVM framework with three input groups: invasive, non-invasive, and their combination. The invasive group comprises laboratory results. The non-invasive group comprises patient clinical and demographic data. The joint group combines the invasive and non-invasive features. P1, P2, and P3 represent the prediction performance of the non-invasive, joint, and invasive models, respectively. The non-invasive model displayed good prediction performance for the more distant future (P1), whereas the invasive model showed good prediction performance for the near future (P3). Neighborhood Component Analysis (NCA), recursive feature elimination via Support Vector Machine (SVM-RFE), and linear SVM with least absolute shrinkage and selection operator (Lasso) sparsity regularization (Sparse Linear SVM) were used to inspect feature contributions and dynamics with respect to the outcome.
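As a rough illustration of two of the feature inspection steps named in this caption, the sketch below runs SVM-RFE and an L1-regularized (Lasso-style) linear SVM with scikit-learn. Everything in it is an assumption made for illustration: the data are synthetic, the feature count, `C` value, and number of retained features are arbitrary, and scikit-learn is a swapped-in toolkit rather than the authors' implementation. NCA-based feature weighting is not shown.

```python
# Illustrative sketch only: synthetic data, not the study's cohort or code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Hypothetical stand-in for an admission-day feature matrix (rows = patients).
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)

# SVM-RFE: repeatedly fit a linear SVM and drop the feature whose weight
# has the smallest magnitude, until only the requested number remains.
rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=3, step=1)
rfe.fit(X, y)
kept_by_rfe = np.flatnonzero(rfe.support_)

# Sparse linear SVM: an L1 penalty can drive some weights to exactly zero.
# Smaller C means stronger regularization and typically fewer surviving features.
sparse_svm = LinearSVC(penalty="l1", dual=False, C=0.05, max_iter=10000)
sparse_svm.fit(X, y)
kept_by_l1 = np.flatnonzero(sparse_svm.coef_[0])

print("RFE kept features:", kept_by_rfe)
print("L1 kept features:", kept_by_l1)
```

Sweeping `n_features_to_select` or `C` and recording model performance at each setting would give curves comparable in spirit to a recursive elimination or sparsity analysis.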

**Fig 2. Contribution of demographic, clinical, and laboratory features to mortality prediction.**
(A) The results of the regularized NCA analysis display the contribution of single features to mortality prediction. Features are sorted by contribution importance and category. Features with prominent weights are marked with orange squares for visual convenience. (B) A favorable feature space (PTT and age), where the information content of the features with respect to the outcome is high, so many data points can be visually distinguished via an illustrative decision border. (C) In contrast, an unfavorable feature space (Sex and Hgb), where the low information content of the features leaves data points crowded together and hard to distinguish. Panels B and C were created using half of the data and Principal Component Analysis (PCA) for illustrative purposes.

**Fig 3. Comparison of mortality prediction of invasive and non-invasive models.**
(A) ROC curves of the joint, invasive, and non-invasive models. (B) Investigation of the models’ performance and robustness with respect to sample size. For each data point, a model was trained and evaluated using 90% of the data, randomly bootstrapped from the main dataset while maintaining the original discharged-to-expired ratio. The models were robust to the sample size, and no significant difference was observed between the performance of the invasive and non-invasive models. (C) Performance table of the invasive, non-invasive, and joint models. Performances are reported as means with standard deviations. (D) Comparison of the dynamics of laboratory and non-invasive features for randomly selected combinations of features. (E) Recursive feature elimination. Compared with invasive features, prominent non-invasive features carried significant predictive information. In general, the first three features with prominent contributions to the improvement of the non-invasive model’s performance were SPO₂, age, and presence of cardiovascular disorders; the first three invasive features were BUN, LDH, and PTT. (F) Sparsity analysis. Sparse linear SVM was used to investigate optimal feature combinations for fixed numbers of predictors. For a specific sparsity level (number of features), the non-invasive model performs better than the invasive model. Green and gray represent the non-invasive and invasive models, respectively.
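The resampling described for panel (B), drawing 90% of the data while keeping the discharged-to-expired ratio, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the function name `stratified_subsample`, the label coding, and the patient counts are hypothetical, and the draw is made without replacement because the caption does not say which scheme was used.

```python
import random
from collections import defaultdict

def stratified_subsample(labels, fraction, seed=None):
    """Return sorted indices of a random subsample that keeps each class's share.

    Draws without replacement within each class; swap rng.sample for
    rng.choices(indices, k=k) if sampling with replacement is wanted.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        k = round(len(indices) * fraction)  # per-class sample size
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)

# Hypothetical outcome vector: 0 = discharged, 1 = expired (made-up counts).
labels = [0] * 80 + [1] * 20
subset = stratified_subsample(labels, fraction=0.9, seed=42)
n_expired = sum(labels[i] for i in subset)
print(len(subset), n_expired)  # prints "90 18"
```

Repeating the draw with different seeds and refitting a model on each subset would give a spread of performance values from which a mean and standard deviation can be reported.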

**Fig 4. Temporal range model predictions.**
(A) Temporal distribution of patient expiration intervals. The black vertical dashed line marks the peak of the expiration distribution, which was 3 days after admission. The gray vertical dashed line marks the median expiration interval, which was 7 days after admission. (B) and (C) Prediction performance of the invasive and non-invasive models across the expiration temporal spectrum. For panel (B), the invasive and non-invasive models were trained on the entire dataset. Afterwards, the expiration prediction performance was evaluated for 8 different expiration intervals. Days to outcome represents the number of days between patient admission and expiration. For panel (C), patient data were divided into three expiration intervals: from admission to day 3, from day 3 to day 7, and after day 7. For each interval, independent SVM models were trained and the true expiration ratio (true positive rate) was reported for each interval’s model. While invasive features were better predictors of imminent expiration, they were outperformed by non-invasive features over longer expiration intervals. Green and gray represent the non-invasive and invasive models, respectively.
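The per-interval true positive rate reported in this figure can be sketched as below. This is a minimal sketch with made-up values: the function name `tpr_by_interval` is hypothetical, and assigning day 3 and day 7 to the earlier interval is an assumption, since the caption does not state how boundary days were handled. It scores a single set of predictions by interval and does not reproduce the separate per-interval model training of panel (C).

```python
def tpr_by_interval(days_to_outcome, predicted_expired, edges=(3, 7)):
    """True positive rate among expired patients, grouped by days from admission.

    days_to_outcome: days between admission and expiration, one per expired patient.
    predicted_expired: matching booleans, True if the model predicted expiration.
    edges: inclusive upper bounds of all intervals except the last.
    """
    names = [f"<= day {edges[0]}"]
    names += [f"day {lo} to {hi}" for lo, hi in zip(edges, edges[1:])]
    names.append(f"> day {edges[-1]}")
    hits = [0] * len(names)
    totals = [0] * len(names)
    for days, predicted in zip(days_to_outcome, predicted_expired):
        bin_idx = sum(days > edge for edge in edges)  # number of edges exceeded
        totals[bin_idx] += 1
        hits[bin_idx] += bool(predicted)
    # None marks an interval that contains no patients.
    return {name: (h / t if t else None) for name, h, t in zip(names, hits, totals)}

# Made-up example values, not data from the study.
days = [1, 2, 3, 5, 6, 8, 10, 14]
preds = [True, True, False, True, False, True, True, False]
result = tpr_by_interval(days, preds)
print(result)
```

With these example values, two of three expirations are caught in the first and last intervals and one of two in the middle interval.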

Cited by

Detection of COVID-19 Patients Using Machine Learning Techniques: A Nationwide Chilean Study.
Ormeño P, Márquez G, Guerrero-Nancuante C, Taramasco C. Int J Environ Res Public Health. 2022 Jun 30;19(13):8058. doi: 10.3390/ijerph19138058. PMID: 35805713.
Artificial intelligence approach towards assessment of condition of COVID-19 patients - Identification of predictive biomarkers associated with severity of clinical condition and disease progression.
Blagojević A, Šušteršič T, Lorencin I, Šegota SB, Anđelić N, Milovanović D, Baskić D, Baskić D, Petrović NZ, Sazdanović P, Car Z, Filipović N. Comput Biol Med. 2021 Nov;138:104869. doi: 10.1016/j.compbiomed.2021.104869. Epub 2021 Sep 14. PMID: 34547582.
AD-CovNet: An exploratory analysis using a hybrid deep learning model to handle data imbalance, predict fatality, and risk factors in Alzheimer's patients with COVID-19.
Akter S, Das D, Haque RU, Quadery Tonmoy MI, Hasan MR, Mahjabeen S, Ahmed M. Comput Biol Med. 2022 Jul;146:105657. doi: 10.1016/j.compbiomed.2022.105657. Epub 2022 May 22. PMID: 35672170.
Online COVID-19 diagnosis prediction using complete blood count: an innovative tool for public health.
Teng X, Wang Z. BMC Public Health. 2023 Dec 19;23(1):2536. doi: 10.1186/s12889-023-17477-8. PMID: 38114942.
A composite ranking of risk factors for COVID-19 time-to-event data from a Turkish cohort.
Ulgen A, Cetin S, Cetin M, Sivgin H, Li W. Comput Biol Chem. 2022 Jun;98:107681. doi: 10.1016/j.compbiolchem.2022.107681. Epub 2022 Apr 9. PMID: 35487152.
