Multicenter Study
Lancet Digit Health. 2023 Oct;5(10):e657-e667.
doi: 10.1016/S2589-7500(23)00128-0. Epub 2023 Aug 18.

Illness severity assessment of older adults in critical illness using machine learning (ELDER-ICU): an international multicentre study with subgroup bias evaluation

Xiaoli Liu et al. Lancet Digit Health. 2023 Oct.

Abstract

Background: Comorbidity, frailty, and decreased cognitive function lead to a higher risk of death in elderly patients (more than 65 years of age) during acute medical events. Early and accurate illness severity assessment can support appropriate decision making for clinicians caring for these patients. We aimed to develop ELDER-ICU, a machine learning model to assess the illness severity of older adults admitted to the intensive care unit (ICU) with cohort-specific calibration and evaluation for potential model bias.

Methods: In this retrospective, international multicentre study, the ELDER-ICU model was developed using data from 14 US hospitals, and validated in 171 hospitals from the USA and Netherlands. Data were extracted from the Medical Information Mart for Intensive Care database, electronic ICU Collaborative Research Database, and Amsterdam University Medical Centers Database. We used six categories of data as predictors, including demographics and comorbidities, physical frailty, laboratory tests, vital signs, treatments, and urine output. Patient data from the first day of ICU stay were used to predict in-hospital mortality. We used the eXtreme Gradient Boosting algorithm (XGBoost) to develop models and the SHapley Additive exPlanations method to explain model prediction. The trained model was calibrated before internal, external, and temporal validation. The final XGBoost model was compared against three other machine learning algorithms and five clinical scores. We performed subgroup analysis based on age, sex, and race. We assessed the discrimination and calibration of models using the area under receiver operating characteristic (AUROC) and standardised mortality ratio (SMR) with 95% CIs.
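The two evaluation metrics named above can be illustrated with a minimal sketch. It assumes AUROC is computed from the Mann-Whitney rank statistic, SMR is observed deaths divided by the sum of predicted risks, and 95% CIs come from a percentile bootstrap over patients; the abstract does not state how the authors derived their intervals, so the bootstrap is an assumption. The function names and the toy data are hypothetical and are not taken from the study.

```python
import random

def auroc(y_true, y_prob):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    ranks = [0.0] * len(y_prob)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied predictions.
        while j + 1 < len(order) and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both outcome classes")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def smr(y_true, y_prob):
    """Standardised mortality ratio: observed deaths / expected deaths."""
    return sum(y_true) / sum(y_prob)

def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed method); resamples patients with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single outcome class
            stats.append(metric(ys, [y_prob[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data, made up for illustration (1 = died in hospital).
y_true = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.1, 0.2, 0.7, 0.3, 0.8, 0.4, 0.5, 0.1, 0.9, 0.2]
print("AUROC:", auroc(y_true, y_prob))
print("SMR:", smr(y_true, y_prob))
print("AUROC 95% CI:", bootstrap_ci(auroc, y_true, y_prob, n_boot=200))
```

An SMR near 1 means the predicted number of deaths matches the observed number; values above 1 indicate that the model underestimates mortality, and values below 1 that it overestimates it.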

Findings: Using the development dataset (n=50 366) and predictive model building process, the XGBoost algorithm performed the best in all types of validations compared with other machine learning algorithms and clinical scores (internal validation with 5037 patients from 14 US hospitals, AUROC=0·866 [95% CI 0·851-0·880]; external validation in the US population with 20 541 patients from 169 hospitals, AUROC=0·838 [0·829-0·847]; external validation in European population with 2411 patients from one hospital, AUROC=0·833 [0·812-0·853]; temporal validation with 4311 patients from one hospital, AUROC=0·884 [0·869-0·897]). In the external validation set (US population), the median AUROCs of bias evaluations covering eight subgroups were above 0·81, and the overall SMR was 0·99 (0·96-1·03). The top ten risk predictors were the minimum Glasgow Coma Scale score, total urine output, average respiratory rate, mechanical ventilation use, best state of activity, Charlson Comorbidity Index score, geriatric nutritional risk index, code status, age, and maximum blood urea nitrogen. A simplified model containing only the top 20 features (ELDER-ICU-20) had similar predictive performance to the full model.
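A subgroup bias evaluation of the kind summarised above can be sketched as follows: patients are grouped by one attribute, discrimination (AUROC) and calibration (SMR) are computed within each group, and the median subgroup AUROC is reported. This is a minimal illustration under stated assumptions, not the authors' pipeline. The record fields `sex`, `died`, and `risk`, the helper `subgroup_report`, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import median

def auroc(y_true, y_prob):
    """Pairwise AUROC: share of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def subgroup_report(records, key):
    """Group records by `key` and return {subgroup: (n, AUROC, SMR)}."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    report = {}
    for name, rows in groups.items():
        y = [r["died"] for r in rows]
        p = [r["risk"] for r in rows]
        # SMR = observed deaths / expected deaths (sum of predicted risks).
        report[name] = (len(rows), auroc(y, p), sum(y) / sum(p))
    return report

# Hypothetical records: `risk` is a predicted probability of in-hospital death.
records = [
    {"sex": "F", "died": 0, "risk": 0.10},
    {"sex": "F", "died": 1, "risk": 0.60},
    {"sex": "F", "died": 0, "risk": 0.30},
    {"sex": "F", "died": 1, "risk": 0.20},
    {"sex": "M", "died": 0, "risk": 0.20},
    {"sex": "M", "died": 1, "risk": 0.70},
    {"sex": "M", "died": 0, "risk": 0.40},
    {"sex": "M", "died": 1, "risk": 0.90},
]

report = subgroup_report(records, "sex")
for name, (n, auc, ratio) in sorted(report.items()):
    print(f"{name}: n={n} AUROC={auc:.2f} SMR={ratio:.2f}")
print("median subgroup AUROC:", median(a for _, a, _ in report.values()))
```

Large gaps between subgroups in either metric would flag potential bias. On real data each subgroup would also need enough deaths and survivors for the estimates to be stable.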

Interpretation: The ELDER-ICU model reliably predicts the risk of in-hospital mortality using routinely collected clinical features. The predictions could inform clinicians about patients who are at elevated risk of deterioration. Prospective validation of this model in clinical practice and a process for continuous performance monitoring and model recalibration are needed.

Funding: National Institutes of Health, National Natural Science Foundation of China, National Special Health Science Program, Health Science and Technology Plan of Zhejiang Province, Fundamental Research Funds for the Central Universities, Drug Clinical Evaluate Research of Chinese Pharmaceutical Association, and National Key R&D Program of China.

Conflict of interest statement

Declaration of interests: We declare no competing interests.

Figures

Figure 1: Cohort flow diagram
(A) MIMIC. (B) eICU-CRD. (C) AmsterdamUMCdb. We combined the patients from 2001 to 2016 in the MIMIC dataset and patients from 13 hospitals in the eICU-CRD into a development set. The remaining 169 hospitals in eICU-CRD were combined as the external validation set (USA). The full cohort from the AmsterdamUMCdb was used as the external validation set (Europe). The remaining patients from MIMIC (2017–2019) were used as the temporal validation set. (D) Characteristics of our geriatric mortality prediction model and variable types of ELDER-ICU. BIDMC=Beth Israel Deaconess Medical Center. GCS=Glasgow Coma Scale. HR=heart rate. ICU=intensive care unit. MBP=mean blood pressure. RR=respiratory rate. SBP=systolic blood pressure. SpO2=oxygen saturation. T=temperature. UMC=University Medical Centers. *Development set. †Temporal validation. ‡Development set. §External validation in USA. ¶External validation in Europe.
Figure 2: Prediction performance: discrimination comparison of machine learning models and clinical scores in four types of validation
(A) Internal validation. (B) External validation (US). (C) External validation (Europe). (D) Temporal validation. APSIII=Acute Physiology Score III. AUROC=area under receiver operating characteristic. LR=logistic regression. NB=naive Bayes. OASIS=Oxford Acute Severity of Illness Score. RF=random forest. SAPS=Simplified Acute Physiology Score. SOFA=Sequential Organ Failure Assessment. XGBoost=eXtreme Gradient Boosting.
Figure 3: Bias evaluation based on the subgroup analysis of age, sex, and race
(A) Discrimination metric of AUROC. Dotted line denotes the mean AUROC of all patients in the external (US) validation set. (B) Calibration metric of SMR. Diamond indicates the point estimate and 95% CIs of the overall pooled effect from all studied subgroups. AUROC=area under receiver operating characteristic. SMR=standardised mortality ratio. O=observed. E=expected.
Figure 4: Explanation of the older mortality prediction model
(A), (B) Top 20 risk predictors for early prediction of geriatric (more than 65 years of age) mortality. Inference process of the model with (C) a non-surviving and (D) a surviving patient. AST=aspartate aminotransferase. BUN=blood urea nitrogen. CCI=Charlson Comorbidity Index. GCS=Glasgow Coma Scale. GNRI=Geriatric Nutritional Risk Index. ICU=intensive care unit. LOS=length of stay. MBP=mean blood pressure. SBP=systolic blood pressure. SHAP=SHapley Additive exPlanations. SpO2=oxygen saturation.
