Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare system

Lathan Liou¹, Erick Scott², Prathamesh Parchure³, Yuxia Ouyang⁴,⁵, Natalia Egorova⁴, Robert Freeman³, Ira S Hofer⁵,⁶,⁷, Girish N Nadkarni⁶,⁷, Prem Timsina³, Arash Kia³,⁵, Matthew A Levin³,⁵,⁶

Affiliations

¹ Icahn School of Medicine at Mount Sinai, New York, NY, USA. lathan.liou@icahn.mssm.edu.
² cStructure, La Jolla, CA, USA.
³ Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁴ Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁵ Department of Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁶ The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁷ The Division of Data Driven and Digital Medicine (D3M), The Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

PMID: 38844546
PMCID: PMC11156633
DOI: 10.1038/s41746-024-01141-5

Lathan Liou et al. NPJ Digit Med. 2024 Jun 6;7(1):149. doi: 10.1038/s41746-024-01141-5.

Abstract

Malnutrition is a frequently underdiagnosed condition that leads to increased morbidity, mortality, and healthcare costs. The Mount Sinai Health System (MSHS) deployed a machine learning model (MUST-Plus) to detect malnutrition upon hospital admission. However, in diverse patient groups, a poorly calibrated model may lead to misdiagnosis, exacerbating healthcare disparities. We explored the model's calibration across different variables and evaluated methods to improve calibration. Data from adult patients admitted to five MSHS hospitals from January 1, 2021 to December 31, 2022 were analyzed. We compared MUST-Plus predictions to the registered dietitian's formal assessment. Hierarchical calibration was assessed and compared between the recalibration sample (N = 49,562) of patients admitted between January 1, 2021 and December 31, 2022, and the hold-out sample (N = 17,278) of patients admitted between January 1, 2023 and September 30, 2023. Statistical differences in calibration metrics were tested using bootstrapping with replacement. Before recalibration, the overall model calibration intercept was -1.17 (95% CI: -1.20, -1.14), the slope was 1.37 (95% CI: 1.34, 1.40), and the Brier score was 0.26 (95% CI: 0.25, 0.26). Both weak and moderate measures of calibration differed significantly between White and Black patients and between male and female patients. Logistic recalibration significantly improved calibration of the model across race and gender in the hold-out sample. The original MUST-Plus model showed significant differences in calibration between White and Black patients. It also overestimated malnutrition in females compared to males. Logistic recalibration effectively reduced miscalibration across all patient subgroups. Continual monitoring and timely recalibration can improve model accuracy.
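The calibration quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and the abstract does not describe their implementation. It assumes the commonly used definitions: the calibration intercept is the intercept of a logistic model with the logit of the predicted probability as an offset, the calibration slope is the coefficient on that logit, and the Brier score is the mean squared difference between prediction and outcome. Recalibration-in-the-large re-estimates only the intercept, and logistic recalibration re-estimates both intercept and slope. All function names and the simulated data are hypothetical, and the simulated values are not meant to reproduce the reported results.

```python
import math
import random


def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Numerically stable inverse of logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(lp, y, fit_slope=True, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of P(y=1) = sigmoid(a + b*lp).
    With fit_slope=False, b stays at 1 so lp acts as an offset."""
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, t in zip(lp, y):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g_a += t - p
            g_b += (t - p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        if fit_slope:
            det = h_aa * h_bb - h_ab * h_ab
            da = (h_bb * g_a - h_ab * g_b) / det
            db = (h_aa * g_b - h_ab * g_a) / det
        else:
            da, db = g_a / h_aa, 0.0
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def calibration_metrics(p_hat, y):
    """Return (calibration intercept, calibration slope, Brier score)."""
    lp = [logit(p) for p in p_hat]
    intercept, _ = fit_logistic(lp, y, fit_slope=False)
    _, slope = fit_logistic(lp, y, fit_slope=True)
    brier = sum((p - t) ** 2 for p, t in zip(p_hat, y)) / len(y)
    return intercept, slope, brier


def recalibrate(p_fit, y_fit, p_new, fit_slope=True):
    """Learn the mapping on one sample and apply it to another.
    fit_slope=False gives recalibration-in-the-large,
    fit_slope=True gives logistic recalibration."""
    a, b = fit_logistic([logit(p) for p in p_fit], y_fit, fit_slope)
    return [sigmoid(a + b * logit(p)) for p in p_new]


def bootstrap_ci(p_hat, y, stat_index, n_boot=50, seed=0):
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        sample = calibration_metrics([p_hat[i] for i in idx], [y[i] for i in idx])
        stats.append(sample[stat_index])
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot)]


def simulate(n, seed):
    """Synthetic, deliberately miscalibrated predictions (illustration only)."""
    rng = random.Random(seed)
    p_hat, y = [], []
    for _ in range(n):
        z = rng.gauss(-1.0, 1.5)                    # true log-odds
        y.append(1 if rng.random() < sigmoid(z) else 0)
        p_hat.append(sigmoid((z + 1.0) / 1.3))      # distorted prediction
    return p_hat, y


p_fit, y_fit = simulate(1500, seed=1)     # stand-in for a recalibration sample
p_hold, y_hold = simulate(1500, seed=2)   # stand-in for a hold-out sample
before = calibration_metrics(p_hold, y_hold)
after = calibration_metrics(recalibrate(p_fit, y_fit, p_hold), y_hold)
slope_ci = bootstrap_ci(p_hold, y_hold, stat_index=1)
print("before (intercept, slope, Brier):", before)
print("after  (intercept, slope, Brier):", after)
print("bootstrap 95% interval for slope before recalibration:", slope_ci)
```

In this sketch the recalibration parameters are learned on one sample and evaluated on a separate one, mirroring the recalibration and hold-out split described above. A subgroup comparison would apply `calibration_metrics` to each subgroup's predictions and outcomes separately.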

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Calibration Curves.**
Columns from left to right are curves for (a) No Recalibration, (b) Recalibration-in-the-Large, and (c) Logistic Recalibration for Black vs. White patients, and (d) No Recalibration, (e) Recalibration-in-the-Large, and (f) Logistic Recalibration for male vs. female patients.
