PLoS One. 2021 Jul 2;16(7):e0252384.
doi: 10.1371/journal.pone.0252384. eCollection 2021.

A machine learning based exploration of COVID-19 mortality risk

Mahdi Mahdavi et al. PLoS One.

Abstract

Early prediction of patient mortality risk during a pandemic can decrease mortality by supporting efficient resource allocation and treatment planning. This study aimed to develop and compare machine learning models for prognosis prediction based on invasive laboratory data and non-invasive clinical and demographic data from the patients' day of admission. Three Support Vector Machine (SVM) models were developed and compared using invasive features, non-invasive features, and both groups combined. The results suggested that non-invasive features could provide mortality predictions similar to those of the invasive model and roughly on par with the joint model. Feature inspection with SVM-RFE and sparsity analysis showed that, compared with the invasive model, the non-invasive model can perform better with fewer features, pointing to high predictive information content in several non-invasive features, including SPO2, age, and cardiovascular disorders. Furthermore, while the invasive model provided better mortality predictions for the imminent future, non-invasive features performed better for more distant expiration intervals. Early mortality prediction using non-invasive models can offer insight into where and with whom to intervene. Combined with novel technologies such as wireless wearable devices, these models can create powerful frameworks for various medical assignments and patient triage.
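
The three-way comparison described in the abstract amounts to training the same kind of model on three different column subsets of the admission data. The following is a minimal sketch of that partitioning step only, not the authors' code. The column names are hypothetical spellings of features named on this page (SPO2, age, sex, cardiovascular disorders, BUN, LDH, PTT, Hgb), and the two patient records are invented for illustration.

```python
# Hypothetical column names; only the feature labels themselves come from the text.
NON_INVASIVE = ["SPO2", "age", "sex", "cardiovascular_disorder"]
INVASIVE = ["BUN", "LDH", "PTT", "Hgb"]

FEATURE_GROUPS = {
    "non_invasive": NON_INVASIVE,
    "invasive": INVASIVE,
    "joint": NON_INVASIVE + INVASIVE,
}


def build_view(records, group):
    """Return one row per patient, restricted to the columns of `group`."""
    columns = FEATURE_GROUPS[group]
    return [[rec[col] for col in columns] for rec in records]


# Two made-up admission records (illustrative values, not study data).
patients = [
    {"SPO2": 88, "age": 71, "sex": 1, "cardiovascular_disorder": 1,
     "BUN": 34.0, "LDH": 610.0, "PTT": 41.0, "Hgb": 11.2},
    {"SPO2": 96, "age": 45, "sex": 0, "cardiovascular_disorder": 0,
     "BUN": 14.0, "LDH": 320.0, "PTT": 29.0, "Hgb": 13.8},
]

# One input matrix per model: invasive, non-invasive, and joint.
views = {name: build_view(patients, name) for name in FEATURE_GROUPS}
```

Each entry of `views` would then be passed to its own classifier, so that any performance difference can be attributed to the feature group rather than to the model family.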

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Illustration of the modeling framework.
Three machine learning models were developed using the SVM framework with three input groups: invasive, non-invasive, and their combination. The invasive group comprises laboratory results. The non-invasive group comprises patient clinical and demographic data. The joint group combines the invasive and non-invasive features. P1, P2, and P3 represent the prediction performance of the non-invasive, joint, and invasive models, respectively. The non-invasive model displayed good prediction performance for the more distant future (P1), whereas the invasive model showed good prediction performance for the near future (P3). Neighborhood Component Analysis (NCA), recursive feature elimination via Support Vector Machine (SVM-RFE), and linear SVM with least absolute shrinkage and selection operator (Lasso) sparsity regularization (Sparse Linear SVM) were used to inspect feature contributions and dynamics with respect to the outcome.
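
The idea behind SVM-RFE can be sketched in a few lines: fit a linear SVM, drop the feature with the smallest absolute weight, and repeat until one feature remains. The code below is a toy illustration under stated assumptions, not the pipeline used in the study. The trainer is a plain batch subgradient descent on the L2-regularized hinge loss, the hyperparameters (`epochs`, `lr`, `lam`) are arbitrary, and the three synthetic features are invented so that one matches the label exactly, one agrees with it on most rows, and one is uncorrelated with it.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit w, b by batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, names):
    """Drop the feature with the smallest |weight| each round.

    Returns the feature names in elimination order; the last entry
    is the feature that survived every round.
    """
    remaining = list(range(len(names)))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        eliminated.append(names[remaining.pop(weakest)])
    return eliminated + [names[remaining[0]]]


# Toy data (not study data); labels are +1 / -1.
y = [1, 1, 1, 1, -1, -1, -1, -1]
informative = [1, 1, 1, 1, -1, -1, -1, -1]  # matches the label exactly
partial = [1, 1, 1, -1, -1, -1, -1, 1]      # agrees with the label on 6 of 8 rows
noise = [1, -1, 1, -1, 1, -1, 1, -1]        # uncorrelated with the label
X = [list(row) for row in zip(informative, partial, noise)]

ranking = svm_rfe(X, y, ["informative", "partial", "noise"])
print(ranking)  # elimination order, most important feature last
```

Reading the ranking from the end gives the features in decreasing order of importance, which is how a statement such as "the first three features with prominent contributions" can be derived from a recursive elimination run.
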
Fig 2. Contribution of demographic, clinical, and laboratory features to mortality prediction.
(A) The results of the regularized NCA analysis display the contribution of single features to mortality prediction. Features are sorted by contribution importance and category. Features with prominent weights are marked with orange squares for visual convenience. (B) A favorable feature space (PTT and age), where the information content of the features with respect to the outcome is high, so many data points can be visually distinguished via an illustrative decision border. (C) In contrast, an unfavorable feature space (Sex and Hgb), where the low information content of the features has left the data points crowded together and hard to distinguish. Panels B and C were created using half of the data and Principal Component Analysis (PCA) for illustrative purposes.
Fig 3. Comparison of mortality prediction of invasive and non-invasive models.
(A) ROC curves of the joint, invasive, and non-invasive models. (B) Investigation of the models' performance and robustness with respect to sample size. For each data point, a model was trained and evaluated using 90% of the data, randomly bootstrapped from the main dataset while maintaining the original discharged-to-expired ratio. The models were robust to the sample size, and no significant difference was observed between the performance of the invasive and non-invasive models. (C) Performance table of the invasive, non-invasive, and joint models. Performances are reported as means with standard deviations. (D) Comparison of the dynamics of laboratory and non-invasive features for randomly selected combinations of features. (E) Recursive feature elimination. Compared with invasive features, the prominent non-invasive features carried substantial predictive information. In general, the first three features with prominent contributions to the improvement of the non-invasive model's performance were SPO2, age, and presence of cardiovascular disorders; the first three invasive features were BUN, LDH, and PTT. (F) Sparsity analysis. Sparse linear SVM was used to investigate optimal feature combinations for fixed numbers of predictors. For a given sparsity level (number of features), the non-invasive model performs better than the invasive model. Green and gray represent the non-invasive and invasive models, respectively.
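
The resampling in panel B can be approximated by drawing indices class by class, so that the discharged-to-expired ratio of the resample matches the original. This is a sketch under assumptions: sampling is done with replacement (the usual meaning of "bootstrapped"), the helper name `stratified_bootstrap` is hypothetical, and the 80/20 cohort below is invented rather than taken from the study.

```python
import random


def stratified_bootstrap(labels, fraction, rng):
    """Sample indices with replacement, class by class, keeping the class ratio."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    sample = []
    for members in by_class.values():
        k = round(fraction * len(members))  # per-class sample size
        sample.extend(rng.choices(members, k=k))
    return sample


# Made-up cohort: 80 discharged (0) and 20 expired (1) patients.
labels = [0] * 80 + [1] * 20
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = stratified_bootstrap(labels, 0.9, rng)

# The expired share of the resample equals the share in the full cohort.
expired_share = sum(labels[i] for i in sample) / len(sample)
```

Repeating the draw with different seeds and recording a performance metric each time yields the kind of mean and standard deviation summary reported in panel C.
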
Fig 4. Temporal range model predictions.
(A) Temporal distribution of patient expiration intervals. The black vertical dashed line corresponds to the peak of the expiration distribution, which was 3 days from admission. The gray vertical dashed line corresponds to the median expiration interval, which was 7 days after admission. (B) and (C) Prediction performance of the invasive and non-invasive models across the temporal spectrum of expiration. For panel (B), the invasive and non-invasive models were trained on the entire dataset. Afterwards, the expiration prediction performance was evaluated for 8 different expiration intervals. Days to outcome represents the number of days between patient admission and expiration. For panel (C), patient data were divided into three expiration intervals: from admission to day 3, from day 3 to day 7, and after day 7. For each interval, independent SVM models were trained, and the true expiration ratio (true positive rate) was reported for each interval's model. While invasive features were better predictors of imminent expiration, they were outperformed by non-invasive features over longer expiration intervals. Green and gray represent the non-invasive and invasive models, respectively.
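
The per-interval true positive rate reported in panels B and C can be computed by grouping expired patients by their days to outcome and counting how many were correctly flagged in each group. The sketch below covers only that reporting step and assumes the predictions already exist. The interval boundaries are one possible reading of the caption (day 3 falls in the first interval and day 7 in the second), the function names are hypothetical, and the eight example patients are invented.

```python
def interval_of(days):
    """Map days from admission to expiration onto three intervals.

    Boundary handling is an assumption: day 3 counts as the first
    interval and day 7 as the second.
    """
    if days <= 3:
        return "0-3"
    if days <= 7:
        return "3-7"
    return ">7"


def tpr_by_interval(days_to_outcome, predicted_expired):
    """True positive rate among expired patients, grouped by interval."""
    hits, totals = {}, {}
    for days, pred in zip(days_to_outcome, predicted_expired):
        key = interval_of(days)
        totals[key] = totals.get(key, 0) + 1
        hits[key] = hits.get(key, 0) + (1 if pred else 0)
    return {key: hits[key] / totals[key] for key in totals}


# Invented example: every patient here expired; 1 means the model predicted it.
days = [1, 2, 3, 5, 6, 9, 12, 20]
preds = [1, 1, 0, 1, 0, 1, 1, 1]
tpr = tpr_by_interval(days, preds)
```

Running the same function on the predictions of an invasive and a non-invasive model gives two sets of per-interval rates that can be compared side by side.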
