Tensor learning of pointwise mutual information from EHR data for early prediction of sepsis
- PMID: 33991856
- DOI: 10.1016/j.compbiomed.2021.104430
Abstract
Early detection of sepsis can facilitate early clinical intervention with effective treatment and may reduce sepsis mortality rates. Machine learning-based automated diagnosis of sepsis from easily recordable physiological data may therefore be more promising than the gold-standard rule-based clinical criteria used in current practice. This study aims to develop a machine learning framework that quantifies the heterogeneity within tabular electronic health record (EHR) data of clinical covariates, capturing both linear relationships and nonlinear correlations, for the early prediction of sepsis. The statistics of pairwise association for each hour-covariate pair within the EHR data are described by a pointwise mutual information (PMI) matrix, computed for every 6-hour window over 24 selected covariates. This matrix represents the heterogeneity of the data as a two-dimensional map. These matrices are stacked along the z-axis, as slices in the xy plane, to form a 3-way tensor for each record with the corresponding length of stay (L). The fused tensor of every record is factorized using Tucker decomposition, and only the core tensor is retained, excluding the three unitary matrices, to provide the latent feature set for predicting sepsis onset. Under a five-fold cross-validation scheme, the 120 latent features obtained from the reshaped core tensor are fed to Light Gradient Boosting Machine (LightGBM) models for binary classification, which further alleviates the class imbalance involved. The machine learning framework is designed via Bayesian optimization, yielding an average normalized utility score of 0.4519, as defined by the challenge organizers, and an area under the receiver operating characteristic curve (AUROC) of 0.8621 on the publicly available PhysioNet/Computing in Cardiology Challenge 2019 training data.
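As an illustration of the PMI step, here is a minimal sketch and not the authors' implementation. It assumes each 6-hour window of non-negative covariate values is normalized into a joint table over (hour, covariate), that PMI is log(p(h, c) / (p(h) p(c))), and that windows slide by one hour; the abstract does not state these details. The names `pmi_matrix` and `fuse_windows` and the synthetic record are hypothetical.

```python
import numpy as np

def pmi_matrix(window, eps=1e-12):
    """PMI of a non-negative (hours x covariates) window.

    Assumption: the window is normalized into a joint probability table.
    """
    window = np.asarray(window, dtype=float)
    joint = window / window.sum()              # p(hour, covariate)
    p_hour = joint.sum(axis=1, keepdims=True)  # marginal over covariates
    p_cov = joint.sum(axis=0, keepdims=True)   # marginal over hours
    # eps guards against log(0) for empty cells.
    return np.log((joint + eps) / (p_hour * p_cov + eps))

def fuse_windows(record, width=6):
    """Stack the PMI matrices of all sliding windows along a third axis.

    Assumption: windows overlap with a stride of one hour.
    """
    n_hours = record.shape[0]
    slices = [pmi_matrix(record[t:t + width])
              for t in range(n_hours - width + 1)]
    return np.stack(slices, axis=2)

# Synthetic record: 10 hours of stay, 24 covariates (made-up values).
rng = np.random.default_rng(0)
record = rng.uniform(0.1, 1.0, size=(10, 24))
tensor = fuse_windows(record)
print(tensor.shape)  # (6, 24, 5)
```

Under these assumptions the third dimension of the tensor grows with the length of stay, which is consistent with the abstract's description of one fused tensor per record.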
The proposed tensor decomposition of the 3-way fused tensor, built from PMI matrices, leverages higher-order temporal interactions between the pairwise associations among clinical values for early prediction of sepsis. This is validated by improved risk prediction power for every hour of ICU admission in terms of utility score, AUROC, and F1 score. The results show a significant improvement, particularly in utility score (~1.5-2%), under a 5-fold cross-validation scheme on the entire training data compared with a top-entrant study that participated in the challenge.
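The core-tensor feature extraction can be sketched with a truncated higher-order SVD (HOSVD), which is one common way to compute a Tucker decomposition; the abstract does not say which algorithm the authors used. The ranks `(6, 5, 4)` are a hypothetical choice made only so that the flattened core has 120 entries, and the input tensor here is random data standing in for a fused PMI tensor.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `matrix` into `tensor` along axis `mode`."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def hosvd_core(tensor, ranks):
    """Truncated HOSVD: return the core tensor and the factor matrices."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span that mode.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        # Project each mode onto its leading subspace.
        core = mode_product(core, u.T, mode)
    return core, factors

# Random stand-in for a fused tensor (6 hours x 24 covariates x 5 windows).
rng = np.random.default_rng(0)
tensor = rng.normal(size=(6, 24, 5))

# Hypothetical ranks chosen so that 6 * 5 * 4 = 120 latent features.
core, factors = hosvd_core(tensor, ranks=(6, 5, 4))
features = core.reshape(-1)  # factor matrices are discarded
print(features.shape)  # (120,)
```

Keeping only the core gives a fixed-length feature vector regardless of the length of stay, provided the record has at least as many windows as the rank chosen for the third mode.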
Keywords: Early prediction; Electronic health records; Machine learning; Medical informatics; Model-based diagnosis; Pointwise mutual information; Sepsis; Tensor factorization.
Copyright © 2021 Elsevier Ltd. All rights reserved.