Machine Learning Multimodal Model for Delirium Risk Stratification

Affiliations

¹ Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York.
² Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York.
³ Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, New York.
⁴ Nursing Administration, Mount Sinai Morningside Hospital, New York, New York.
⁵ Department of Nursing, The Mount Sinai Hospital, New York, New York.
⁶ Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York.
⁷ Department of Anesthesiology, Perioperative, and Pain Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.

PMID: 40332938
PMCID: PMC12059973
DOI: 10.1001/jamanetworkopen.2025.8874

Joseph I Friedman et al. JAMA Netw Open. 2025 May 1;8(5):e258874. doi: 10.1001/jamanetworkopen.2025.8874.

Abstract

Importance: Automating the identification of risk for developing hospital delirium with models that use machine learning (ML) could facilitate more rapid prevention, identification, and treatment of delirium. However, there are very few reports on the performance of ML models for delirium risk stratification in live clinical practice.

Objective: To report on development, operationalization, and validation of a multimodal ML model for delirium risk stratification in live clinical practice and its associations with workflow and clinical outcomes.

Design, setting, and participants: This quality improvement study developed an ML model supported by automated electronic medical records to stratify the risk of non-intensive care unit delirium in live clinical practice using the Confusion Assessment Method as the diagnostic reference standard, with an iterative model update method. Data from patients aged at least 60 years admitted to non-intensive care units at Mount Sinai Hospital between January 2016 and January 2020 were used to train and test the ML model presented. The model was validated in live clinical practice from March 2023 to March 2024. Analysis of the model's associations with workflow and clinical outcomes was conducted retrospectively in 2024, comparing hospitalized patients prior to deployment of any model version (pre-ML cohort) and during model clinical deployment (post-ML cohort).

Main outcomes and measures: Outcomes of interest were area under the receiver operating characteristic curve, monthly delirium detection rates, median length of hospital stay, and daily doses of opiate, benzodiazepine, and antipsychotic medications administered.

Results: The overall sample included 32 284 inpatient admissions (mean [SD] age, 73.56 [9.67] years; 15 157 [46.9%] women). A total of 25 261 inpatient admissions of older patients with both medical and surgical primary diagnoses represented the combined model testing and training cohort (median [IQR] age, 73.37 [66.42-81.36] years) and live clinical deployment validation cohort (median [IQR] age, 72.11 [62.26-78.97] years), while 7023 inpatient admissions of older patients with both medical and surgical primary diagnoses represented the combined pre-ML (median [IQR] age, 74.00 [68.00-81.00] years) and post-ML (median [IQR] age, 75.33 [68.34-82.91] years) cohorts. The model presented is a fusion of electronic medical record patient data features and clinical note features processed by natural language processing. In live clinical practice, model validation yielded an area under the receiver operating characteristic curve of 0.94 (95% CI, 0.93-0.95). Median monthly delirium detection rates among inpatients assessed for delirium with the Confusion Assessment Method increased from 4.42% (95% CI, 3.70%-5.14%) in the pre-ML cohort to 17.17% (95% CI, 15.54%-18.80%) in the post-ML cohort (P < .001). The post-ML cohort received lower daily doses than the pre-ML cohort of benzodiazepines (median [IQR], 0.93 [0.42-2.28] vs 1.60 [0.66-4.27] diazepam dose equivalents; P < .001) and olanzapine (median [IQR], 1.09 [0.38-2.46] mg vs 2.50 [1.17-6.65] mg; P < .001).
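
As an illustration of the discrimination metric reported above, the following minimal sketch computes an area under the receiver operating characteristic curve from risk scores and reference-standard labels, using the pairwise (Mann-Whitney) definition. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study; the abstract does not describe the authors' evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CAM-positive, 0 = CAM-negative.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.6]
toy_auroc = auroc(toy_labels, toy_scores)  # 7 of 9 positive-negative pairs ranked correctly
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; a rank-based implementation would be the usual choice for cohorts of the size reported here.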

Conclusions and relevance: This quality improvement study demonstrates the feasibility of a novel multimodal ML model to automate delirium risk stratification in live clinical practice. The model demonstrated acceptable performance in live clinical practice and may facilitate resource allocation to enhance delirium identification and care.

Conflict of interest statement

Conflict of Interest Disclosures: None reported.

Figures

**Figure 1. Receiver Operating Characteristic Curves for the Fusion Multimodal With Natural Language Processing Model to Predict Delirium**
AUROC indicates area under the receiver operating characteristic curve; post-ML indicates the period after model deployment.

**Figure 2. Comparison of Delirium Detection Rates**
Box-and-whisker plot comparing 5 summary statistics for the monthly delirium detection rates before any machine learning (ML) model deployment and following deployment of the multimodal with natural language processing ML-based delirium risk prediction model in live clinical practice. Delirium detection rates were calculated by dividing the number of positive Confusion Assessment Method (CAM) delirium screening results by the number of total CAM assessments each month. Whiskers indicate range; boxes, IQR; bold line, median.
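
The detection rate calculation described in the caption (positive CAM screening results divided by total CAM assessments each month) can be sketched as follows. This is a minimal illustration under assumed inputs: the function name `monthly_detection_rates`, the `(month, cam_positive)` record layout, and the toy records are hypothetical and do not come from the study.

```python
from collections import defaultdict


def monthly_detection_rates(assessments):
    """Return {month: percent of CAM assessments that were positive}.

    `assessments` is an iterable of (month, cam_positive) pairs, where
    `month` is any sortable key (for example "2023-03") and
    `cam_positive` is a boolean screening result.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for month, cam_positive in assessments:
        totals[month] += 1
        if cam_positive:
            positives[month] += 1
    # Every month in `totals` has at least one assessment, so no division by zero.
    return {m: 100.0 * positives[m] / totals[m] for m in sorted(totals)}


# Hypothetical toy records, not study data.
toy_records = [
    ("2023-03", True), ("2023-03", False), ("2023-03", False), ("2023-03", False),
    ("2023-04", True), ("2023-04", False),
]
toy_rates = monthly_detection_rates(toy_records)
```

Because the denominator is the number of CAM assessments rather than the number of admissions, the rate reflects how often screening finds delirium among patients who were actually assessed.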
