Preserving Informative Presence: How Missing Data and Imputation Strategies Affect the Performance of an AI-Based Early Warning Score
- PMID: 40217663
- PMCID: PMC11989256
- DOI: 10.3390/jcm14072213
Abstract
Background/Objectives: Data availability can affect the performance of AI-based early warning scores (EWSs). This study evaluated how the extent of missing data and the choice of imputation strategy influence the predictive performance of the VitalCare-Major Adverse Event Score (VC-MAES), an AI-based EWS that handles missing values with last observation carried forward and normal-value imputation. VC-MAES forecasts clinical deterioration events, including unplanned ICU transfers, cardiac arrests, or death, up to 6 h in advance.
Methods: We analyzed real-world data from 6039 patient encounters at Keimyung University Dongsan Hospital, Republic of Korea. Performance was evaluated under three scenarios: (1) using only vital signs and age, treating all other variables as missing; (2) reintroducing a full set of real-world clinical variables; and (3) imputing missing values either with values drawn from a distribution within one standard deviation of the observed mean or with Multiple Imputation by Chained Equations (MICE).
Results: VC-MAES achieved an area under the receiver operating characteristic curve (AUROC) of 0.896 using only vital signs and age, outperforming traditional EWSs, including the National Early Warning Score (0.797) and the Modified Early Warning Score (0.722). Reintroducing the full clinical variables improved the AUROC to 0.918, whereas mean-based imputation and MICE decreased it to 0.885 and 0.827, respectively.
Conclusions: VC-MAES demonstrates robust predictive performance with limited inputs, outperforming traditional EWSs. Incorporating actual clinical data significantly improved accuracy. In contrast, mean-based and MICE imputation yielded poorer results than the default normal-value imputation, possibly because they disregard the "informative presence" embedded in missing data patterns. These findings underscore the importance of understanding missingness patterns and employing imputation strategies that consider the decision-making context behind data availability to enhance model reliability.
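The contrast between the two imputation styles can be illustrated with a minimal Python sketch. It is not the VC-MAES implementation: the function names, the reference value for lactate, the cohort values, and the use of a uniform draw for "within one standard deviation of the observed mean" are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

# Hypothetical reference ("normal") value; not taken from the paper.
NORMAL_LACTATE = 1.0


def locf_then_normal(series, normal_value):
    """Carry the last observed value forward; use the normal value
    for time points that precede any observation."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last if last is not None else normal_value)
    return filled


def sample_within_one_sd(series, cohort_values, rng):
    """Replace each missing entry with a uniform draw from
    [mean - sd, mean + sd] of the observed cohort values (an assumed
    reading of the mean-based strategy)."""
    mu, sd = mean(cohort_values), stdev(cohort_values)
    return [v if v is not None else rng.uniform(mu - sd, mu + sd) for v in series]


# A patient whose lactate was never ordered (illustrative data).
never_measured = [None, None, None]
# Lactate is mostly measured in sicker patients, so observed values run high.
observed_in_cohort = [2.5, 4.0, 6.1, 3.2]

default_fill = locf_then_normal(never_measured, NORMAL_LACTATE)
mean_based_fill = sample_within_one_sd(
    never_measured, observed_in_cohort, random.Random(0)
)
```

In this toy setting the default strategy fills the unmeasured lactate with the normal value, which is consistent with a clinician not having seen a reason to order the test. The mean-based fill instead assigns values drawn from a cohort that was measured because of clinical concern, so the stable patient looks abnormal and the signal carried by the absence of the measurement is lost.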
Keywords: artificial intelligence; early warning score; imputation; modified early warning score; national early warning score.
Conflict of interest statement
Authors Taeyong Sim, Sangchul Hahn, Kwang-Joon Kim, Eun-Young Cho, Yeeun Jeong, Ji-hyun Kim and Ki-Byung Lee were employed by the company AITRICS. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.