Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare system
- PMID: 38844546
- PMCID: PMC11156633
- DOI: 10.1038/s41746-024-01141-5
Abstract
Malnutrition is a frequently underdiagnosed condition leading to increased morbidity, mortality, and healthcare costs. The Mount Sinai Health System (MSHS) deployed a machine learning model (MUST-Plus) to detect malnutrition upon hospital admission. However, in diverse patient groups, a poorly calibrated model may lead to misdiagnosis, exacerbating health care disparities. We explored the model's calibration across different variables and methods to improve calibration. Data from adult patients admitted to five MSHS hospitals from January 1, 2021 - December 31, 2022, were analyzed. We compared MUST-Plus prediction to the registered dietitian's formal assessment. Hierarchical calibration was assessed and compared between the recalibration sample (N = 49,562) of patients admitted between January 1, 2021 - December 31, 2022, and the hold-out sample (N = 17,278) of patients admitted between January 1, 2023 - September 30, 2023. Statistical differences in calibration metrics were tested using bootstrapping with replacement. Before recalibration, the overall model calibration intercept was -1.17 (95% CI: -1.20, -1.14), slope was 1.37 (95% CI: 1.34, 1.40), and Brier score was 0.26 (95% CI: 0.25, 0.26). Both weak and moderate measures of calibration were significantly different between White and Black patients and between male and female patients. Logistic recalibration significantly improved calibration of the model across race and gender in the hold-out sample. The original MUST-Plus model showed significant differences in calibration between White vs. Black patients. It also overestimated malnutrition in females compared to males. Logistic recalibration effectively reduced miscalibration across all patient subgroups. Continual monitoring and timely recalibration can improve model accuracy.
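The calibration measures named in the abstract (intercept, slope, Brier score), logistic recalibration, and bootstrapping with replacement can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the function names (`fit_logistic`, `calibration_metrics`, `recalibrate`, `bootstrap_ci`) are hypothetical, and the intercept and slope are estimated jointly from one logistic regression of the outcome on the logit of the predicted probability, which may differ from the authors' exact procedure (a common alternative estimates the intercept with the slope fixed at 1).

```python
import numpy as np

def logit(p, eps=1e-12):
    # Clip to avoid infinities at exactly 0 or 1.
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Fit y ~ expit(a + b * x) by Newton-Raphson; returns (a, b)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return float(beta[0]), float(beta[1])

def brier(p, y):
    return float(np.mean((p - y) ** 2))

def calibration_metrics(p, y):
    # Intercept and slope of the outcome regressed on logit(prediction);
    # a well-calibrated model has intercept near 0 and slope near 1.
    intercept, slope = fit_logistic(logit(p), y)
    return {"intercept": intercept, "slope": slope, "brier": brier(p, y)}

def recalibrate(p, intercept, slope):
    # Logistic recalibration: apply the fitted linear map on the logit scale.
    return expit(intercept + slope * logit(p))

def bootstrap_ci(p, y, stat, n_boot=200, seed=0):
    # Percentile interval from resampling patients with replacement.
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws.append(stat(p[idx], y[idx]))
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return float(lo), float(hi)

# Synthetic, deliberately miscalibrated predictions (not the study data).
rng = np.random.default_rng(42)
true_logit = rng.normal(-1.0, 1.5, size=40_000)
y = (rng.random(40_000) < expit(true_logit)).astype(float)
p_raw = expit(1.0 + 0.7 * true_logit)

# Fit the recalibration map on one sample, evaluate on a hold-out sample.
p_fit, y_fit = p_raw[:20_000], y[:20_000]
p_hold, y_hold = p_raw[20_000:], y[20_000:]

before = calibration_metrics(p_hold, y_hold)
a, b = fit_logistic(logit(p_fit), y_fit)
p_recal = recalibrate(p_hold, a, b)
after = calibration_metrics(p_recal, y_hold)
brier_ci = bootstrap_ci(p_recal, y_hold, brier)

print("before:", before)
print("after: ", after)
print("Brier 95% CI after recalibration:", brier_ci)
```

In this sketch the recalibration parameters are learned on one sample and checked on a separate hold-out sample, mirroring the split described in the abstract. Subgroup comparisons would apply `calibration_metrics` to each subgroup's rows separately.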
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.