Blood Glucose Prediction Algorithms Require Clinically Relevant Performance Criteria Beyond Accuracy
- PMID: 40300777
- DOI: 10.1089/dia.2025.0074
Abstract
Background: The root mean squared error (RMSE) is commonly used to evaluate blood glucose prediction algorithms. However, it primarily measures how well predictions align with the most likely future values, rather than supporting optimal and proactive treatment decisions. Since diabetes management data predominantly features blood glucose values within the target range, RMSE tends to favor models that consistently predict target-range values, often at the expense of detecting clinically critical events such as rapid fluctuations, hypoglycemia, or hyperglycemia. This study examines how and why RMSE biases evaluations toward trivial models, highlighting the need for alternative performance criteria that better reflect clinical priorities.
Methods: We developed the composite glucose prediction metric (CGPM) to integrate three components: RMSE, temporal gain, and geometric mean (glycemic event prediction). A custom loss function was designed to emphasize clinically critical predictions during model training. Pareto frontier analysis was used to assess trade-offs among models with comparable performance.
Results: CGPM was computed for five blood glucose prediction techniques (zero-order hold, naïve linear regression, ridge regression, ridge regression trained with a custom loss function, and a physiology-based model) applied to the OhioT1DM dataset. The data-driven model with the lowest RMSE performed poorly on glycemic event prediction, highlighting RMSE's bias toward target-range predictions. In contrast, the ridge regressor trained with the custom loss function improved event prediction, showing that clinically weighted optimization mitigates biases.
Conclusions: Blood glucose prediction algorithms require evaluation and optimization criteria beyond accuracy to better support optimal treatment decisions. This study introduced the CGPM as an alternative evaluation framework, along with a loss function designed for model optimization that emphasizes clinically critical but rare events. Further clinical validation is needed to refine these criteria and ensure they align more closely with the needs of diabetes management.
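The RMSE bias described in the abstract can be illustrated with a small sketch. This is not the authors' CGPM implementation: the glucose values are synthetic, the 70 mg/dL hypoglycemia threshold is an assumed cut-off, and the geometric mean is taken here as the square root of sensitivity times specificity, a common definition that the abstract does not spell out. Temporal gain is omitted. The helper names (`rmse`, `g_mean`, `pareto_front`) are hypothetical.

```python
import math

HYPO_THRESHOLD = 70.0  # mg/dL; assumed cut-off for a hypoglycemic event


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def g_mean(actual, predicted, threshold=HYPO_THRESHOLD):
    """Assumed event score: sqrt(sensitivity * specificity) for values below threshold."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        event, alarm = a < threshold, p < threshold
        if event and alarm:
            tp += 1
        elif event:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        return float("nan")  # undefined when one class is absent
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


def pareto_front(scores):
    """Names of models not dominated on (lower RMSE, higher G-mean)."""
    front = []
    for name, (r, g) in scores.items():
        dominated = any(
            r2 <= r and g2 >= g and (r2 < r or g2 > g)
            for other, (r2, g2) in scores.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Synthetic data (illustrative only): two hypoglycemic readings out of ten.
actual = [110, 120, 105, 95, 130, 115, 65, 100, 125, 60]
constant = [105] * len(actual)  # trivial model: always predicts in range
event_aware = [140, 90, 135, 68, 160, 85, 68, 130, 95, 66]  # noisier, catches lows

scores = {
    "constant": (rmse(actual, constant), g_mean(actual, constant)),
    "event_aware": (rmse(actual, event_aware), g_mean(actual, event_aware)),
}
for name, (r, g) in scores.items():
    print(f"{name}: RMSE={r:.1f} mg/dL, G-mean={g:.2f}")
print("Pareto front:", pareto_front(scores))
```

On this toy data the constant predictor has the lower RMSE yet detects no hypoglycemic events, while the noisier predictor detects both. Neither dominates the other, which is the kind of trade-off a Pareto frontier analysis is meant to expose.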
Keywords: Pareto frontier analysis; blood glucose prediction; evaluation metric; loss function; machine learning; predictive modeling.
