JAMA Netw Open. 2025 May 1;8(5):e258874.
doi: 10.1001/jamanetworkopen.2025.8874.

Machine Learning Multimodal Model for Delirium Risk Stratification

Joseph I Friedman et al. JAMA Netw Open.

Abstract

Importance: Automating the identification of risk for developing hospital delirium with models that use machine learning (ML) could facilitate more rapid prevention, identification, and treatment of delirium. However, there are very few reports on the performance of ML models for delirium risk stratification in live clinical practice.

Objective: To report on development, operationalization, and validation of a multimodal ML model for delirium risk stratification in live clinical practice and its associations with workflow and clinical outcomes.

Design, setting, and participants: This quality improvement study developed an ML model supported by automated electronic medical records to stratify the risk of non-intensive care unit delirium in live clinical practice using the Confusion Assessment Method as the diagnostic reference standard, with an iterative model update method. Data from patients aged at least 60 years admitted to non-intensive care units at Mount Sinai Hospital between January 2016 and January 2020 were used to train and test the ML model presented. The model was validated in live clinical practice from March 2023 to March 2024. Analysis of the model's associations with workflow and clinical outcomes was conducted retrospectively in 2024, comparing hospitalized patients prior to deployment of any model version (pre-ML cohort) and during model clinical deployment (post-ML cohort).

Main outcomes and measures: Outcomes of interest were area under the receiver operating characteristic curve, monthly delirium detection rates, median length of hospital stay, and daily doses of opiate, benzodiazepine, and antipsychotic medications administered.

Results: The overall sample included 32 284 inpatient admissions (mean [SD] age, 73.56 [9.67] years; 15 157 [46.9%] women). A total of 25 261 inpatient admissions of older patients with both medical and surgical primary diagnoses represented the combined model testing and training cohort (median [IQR] age, 73.37 [66.42-81.36] years) and live clinical deployment validation cohort (median [IQR] age, 72.11 [62.26-78.97] years), while 7023 inpatient admissions of older patients with both medical and surgical primary diagnoses represented the combined pre-ML (median [IQR] age, 74.00 [68.00-81.00] years) and post-ML (median [IQR] age, 75.33 [68.34-82.91] years) cohorts. The model presented is a fusion of electronic medical record patient data features and clinical note features processed by natural language processing. The results of model validation in live clinical practice included an area under the curve of 0.94 (95% CI, 0.93-0.95). Median (IQR) monthly delirium detection rates of inpatients assessed for delirium with the Confusion Assessment Method increased from 4.42% (95% CI, 3.70%-5.14%) in the pre-ML cohort to 17.17% (95% CI, 15.54%-18.80%) in the post-ML cohort (P < .001). Post-ML vs pre-ML cohorts received lower daily doses of benzodiazepines (median [IQR], 0.93 [0.42-2.28] diazepam dose equivalents vs 1.60 [0.66-4.27] diazepam dose equivalents; P < .001) and olanzapine (median [IQR], 1.09 [0.38-2.46] mg vs 2.50 [1.17-6.65] mg; P < .001).
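As an illustration of the area under the curve reported above, the following is a minimal Python sketch of how an area under the receiver operating characteristic curve can be computed from risk scores and reference-standard labels. It uses the pairwise-ranking definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case). The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study; the authors' actual evaluation code is not described in the abstract.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive reference result (e.g., CAM positive), 0 otherwise.
    scores: model risk scores, higher meaning higher predicted risk.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.6]
toy_auc = auroc(toy_labels, toy_scores)  # 7 of 9 positive-negative pairs are ranked correctly
```

The quadratic loop is chosen for readability; for cohorts of the size reported here, a rank-based implementation from an established statistics library would be the more practical choice.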

Conclusions and relevance: This quality improvement study demonstrates the feasibility of a novel multimodal ML model to automate delirium risk stratification in live clinical practice. The model demonstrated acceptable performance in live clinical practice and may facilitate resource allocation to enhance delirium identification and care.

Conflict of interest statement

Conflict of Interest Disclosures: None reported.

Figures

Figure 1. Receiver Operating Characteristic Curves for the Fusion Multimodal With Natural Language Processing Model to Predict Delirium
AUROC indicates area under the receiver operating characteristic curve; post-ML indicates the period after model deployment.

Figure 2. Comparison of Delirium Detection Rates
Box-and-whisker plot comparing 5 summary statistics for the monthly delirium detection rates before any machine learning (ML) model deployment and following deployment of the multimodal with natural language processing ML-based delirium risk prediction model in live clinical practice. Delirium detection rates were calculated by dividing the number of positive Confusion Assessment Method (CAM) delirium screening results by the number of total CAM assessments each month. Whiskers indicate range; boxes, IQR; bold line, median.
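The detection rate calculation described in the Figure 2 legend (positive CAM results divided by total CAM assessments each month) and the 5 summary statistics shown in a box-and-whisker plot can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the month keys, and the counts are hypothetical, and the quartile method (`inclusive`) is one common convention, not necessarily the one the authors used.

```python
from statistics import quantiles
from typing import Dict, Mapping, Sequence, Tuple


def monthly_detection_rates(
    counts: Mapping[str, Tuple[int, int]]
) -> Dict[str, float]:
    """Map each month to positive CAM results / total CAM assessments.

    counts: month -> (positive_cam_results, total_cam_assessments).
    Months with no assessments are skipped to avoid division by zero.
    """
    return {
        month: positive / total
        for month, (positive, total) in counts.items()
        if total > 0
    }


def five_number_summary(values: Sequence[float]) -> Dict[str, float]:
    """Minimum, first quartile, median, third quartile, and maximum."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Hypothetical counts, for illustration only (not study data).
toy_counts = {
    "month_1": (10, 100),
    "month_2": (15, 100),
    "month_3": (20, 100),
    "month_4": (25, 100),
    "month_5": (30, 100),
}
toy_rates = monthly_detection_rates(toy_counts)
toy_summary = five_number_summary(list(toy_rates.values()))
```

In this sketch the whiskers of the plot would correspond to `min` and `max`, the box to `q1` through `q3`, and the bold line to `median`.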
