PLoS One. 2022 Mar 17;17(3):e0264785.
doi: 10.1371/journal.pone.0264785. eCollection 2022.

COVID-19 Risk Stratification and Mortality Prediction in Hospitalized Indian Patients: Harnessing clinical data for public health benefits

Shanmukh Alle et al. PLoS One. 2022.

Abstract

The variability in the clinical course and prognosis of COVID-19 highlights the need for patient sub-group risk stratification based on clinical data. In this study, clinical data from a cohort of hospitalized Indian COVID-19 patients are used to develop risk stratification and mortality prediction models. We analyzed a set of 70 clinical parameters, including physiological and hematological parameters, to develop machine learning models and identify biomarkers. We also compared the Indian and Wuhan cohorts and analyzed the role of steroids. A bootstrap-averaged ensemble of Bayesian networks was also learned to construct an explainable model for discovering actionable influences on mortality and days to outcome. We identified blood parameters, diabetes, co-morbidity and SpO2 levels as important risk stratification features, whereas mortality prediction depends only on blood parameters. XGBoost and logistic regression models yielded the best performance on risk stratification and mortality prediction, respectively (AUC 0.83 and 0.92). Blood coagulation parameters (ferritin, D-Dimer and INR) and immune and inflammation parameters (IL6, LDH and neutrophil (%)) are common features for both risk and mortality prediction. Compared with Wuhan patients, Indian patients with extreme blood parameters showed a higher survival rate. Analyses of medications suggest that a higher proportion of survivors and mild patients who were administered steroids had extreme neutrophil and lymphocyte percentages. The ensemble-averaged Bayesian network structure revealed serum ferritin to be the most important predictor of mortality and vitamin D to influence severity independent of days to outcome. The findings are important for effective triage when healthcare infrastructure is under strain.
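The abstract reports model performance as AUC scores. As a minimal sketch of what that metric measures, and not the authors' evaluation code, the example below computes the area under the ROC curve as the probability that a randomly chosen positive case (for example, a patient who died) receives a higher predicted score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = event, 0 = no event); not from the study.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.83 and 0.92 should be read.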

Conflict of interest statement

The authors Ramanathan Sethuraman, C. Subramanian, Mashrin Srivastava, Avinash Chakravarthi, Johnny Jacob, Madhuri Namagiri, and Varma Konala are employed by Intel Technology India Private Limited, Bangalore, India. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Machine learning pipeline for the development of the risk stratification and mortality prediction.
Fig 2. Clinical sub-phenotypes and the co-morbidity feature diversity.
(A) Clinical sub-phenotype diversity of the COVID-19 patients. The patients are grouped into Recovered and Dead. Each circle represents an individual patient; the color of the circle indicates the severity of the disease, whereas the size of the circle represents the duration of the hospital stay. The number on each circle represents the duration in ICU. (B) Presence of different co-morbid conditions in mild and severe patients. The panel gives a comparative view of the co-morbidities: patients with mild severity are shown in blue, and those with severe COVID-19 infection in orange.
Fig 3.
(A) Confusion matrix of a neural network trained on Wuhan data and tested on Indian data, obtained by comparing the actual and predicted mortality of patients in the dataset. (B) Comparison of the normalized histograms of important features useful for predicting mortality in the Wuhan and Indian cohorts, showing the comparative distribution of clinical parameters between death and survival cases. (C) Pair-wise distances between distributions of important features across the Indian vs. Wuhan survived and dead classes. Distance values were calculated using the Kolmogorov–Smirnov test.
Fig 4.
(A) Comparison of F1 scores for various machine learning models that use patient vitals and lab test results. (B) Performance of the ML models with respect to number of days to outcome.
Fig 5. Distribution plots for lymphocyte (%) and neutrophil (%) in steroid administered and non-administered patients having mild and severe disease.
Fig 6. Explainable AI model to discover and quantify actionable factors.
A zoomed-in portion (A) of the complete structure (B), learned as a directed acyclic graph, revealed the key factors for Mortality and Days to outcome. Each node is a variable, and the edges represent the direction of probabilistic influence learned from data. In the Indian dataset, model inference revealed that Serum Ferritin was the most important predictor of Mortality. Further, high levels of 25-hydroxy vitamin D delayed the Days to outcome independent of Severity Class, thus indicating a potential protective effect despite the outcome being primarily determined by severity. The explainable framework is proposed for reasoning and decision-making in Indian settings. Here we take two examples of outcomes of interest, i.e. mortality and days to mortality. The change in percentage probability of the outcome in a certain interval (e.g. high mortality or a lower number of days to death) was inferred conditioned upon the learned associations in the network. S7 Table shows the inferences using the Exact Inference algorithm on the learned structure, which quantify the key influences.
Fig 7.
Biomarker variations in different patient classes over the course of disease progression for (A) risk and (B) mortality parameters, showing consistent separation of biomarker levels for mortality prediction parameters and a decrease in separation for risk prediction parameters.
Fig 8. Model-based analysis of clinical data for risk stratification with potential clinical implementation.
Machine learning model-based analysis of clinical data variables to identify parameters for risk stratification, and development of a dashboard for implementing the model at the clinical site.
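
The Fig 3 caption states that distances between feature distributions were calculated through the Kolmogorov–Smirnov test. As a rough illustration of that distance, and not the authors' implementation, the sketch below computes the two-sample Kolmogorov–Smirnov statistic: the largest vertical gap between the empirical cumulative distribution functions of two samples. The function name `ks_distance` and the toy samples are hypothetical and are not data from the study.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical
    CDFs, evaluated at every observed value in either sample.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    distance = 0.0
    for x in a_sorted + b_sorted:
        # Empirical CDF at x: fraction of the sample that is <= x.
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


# Hypothetical feature values for two patient groups; not from the study.
group_one = [1, 2, 3, 4]
group_two = [3, 4, 5, 6]
d = ks_distance(group_one, group_two)
print(f"KS distance = {d}")
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap), so larger values in a pair-wise comparison indicate features whose distributions differ more between the two groups.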
