PLoS Comput Biol. 2017 Apr 27;13(4):e1005232.

doi: 10.1371/journal.pcbi.1005232. eCollection 2017 Apr.

Personalized glucose forecasting for type 2 diabetes using data assimilation

David J Albers¹, Matthew Levine¹, Bruce Gluckman², Henry Ginsberg³, George Hripcsak¹, Lena Mamykina¹

Affiliations

¹ Department of Biomedical Informatics, Columbia University, New York, New York, United States of America.
² Departments of Engineering Sciences and Mechanics, Neurosurgery, and Biomedical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America.
³ Department of Medicine, Columbia University, New York, New York, United States of America.

PMID: 28448498
PMCID: PMC5409456
DOI: 10.1371/journal.pcbi.1005232

David J Albers et al. PLoS Comput Biol. 2017.

Erratum in

Correction: Personalized glucose forecasting for type 2 diabetes using data assimilation.
Albers DJ, Levine M, Gluckman B, Ginsberg H, Hripcsak G, Mamykina L. PLoS Comput Biol. 2021 Aug 20;17(8):e1009325. doi: 10.1371/journal.pcbi.1009325. eCollection 2021 Aug. PMID: 34415908. Free PMC article.

Abstract

Type 2 diabetes leads to premature death and reduced quality of life for 8% of Americans. Nutrition management is critical to maintaining glycemic control, yet it is difficult to achieve because glycemic response to nutrition varies widely between individuals. Anticipating the glycemic impact of different meals can be challenging not only for individuals with diabetes but also for expert diabetes educators. Personalized computational models that can accurately forecast the impact of a given meal on an individual's blood glucose levels can serve as the engine for a new generation of decision support tools for individuals with diabetes. However, to be useful in practice, these computational engines need to generate accurate forecasts based on limited datasets consistent with typical self-monitoring practices of individuals with type 2 diabetes. This paper uses three forecasting machines: (i) data assimilation, a technique borrowed from atmospheric physics and engineering that uses Bayesian modeling to infuse data with human knowledge represented in a mechanistic model, to generate real-time, personalized, adaptable glucose forecasts; (ii) model averaging of data assimilation output; and (iii) dynamical Gaussian process model regression. The proposed data assimilation machine, the primary focus of the paper, uses a modified dual unscented Kalman filter to estimate states and parameters, personalizing the mechanistic models. Model selection is used to choose a personalized model for each individual and their measurement characteristics. The data assimilation forecasts are empirically evaluated against actual postprandial glucose measurements captured by individuals with type 2 diabetes, and against predictions generated by experienced diabetes educators after reviewing a set of historical nutritional records and glucose measurements for the same individual. The evaluation suggests that the data assimilation forecasts compare well with specific glucose measurements and match or exceed expert forecasts in accuracy. We conclude by examining ways to present predictions as forecast-derived range quantities and evaluate the comparative advantages of these ranges.
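The forecast-then-correct cycle that an unscented Kalman filter performs can be illustrated with a minimal scalar sketch. This is not the modified dual filter used in the paper: it tracks a single state, omits parameter estimation and process noise, and observes the state directly, in which case the correction reduces to the ordinary Kalman update. The function names, the `kappa` value, and the toy dynamics `toy_glucose_step` are all assumptions made for illustration.

```python
import math


def unscented_forecast(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using three sigma points."""
    n = 1  # state dimension in this toy example
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    propagated = [f(p) for p in points]
    new_mean = sum(w * y for w, y in zip(weights, propagated))
    new_var = sum(w * (y - new_mean) ** 2 for w, y in zip(weights, propagated))
    return new_mean, new_var


def kalman_correct(mean, var, measurement, meas_var):
    """Correct a forecast with a direct, noisy measurement of the state."""
    gain = var / (var + meas_var)
    return mean + gain * (measurement - mean), (1.0 - gain) * var


def toy_glucose_step(glucose):
    """Hypothetical dynamics: relax halfway toward a 100 mg/dL baseline."""
    return glucose + 0.5 * (100.0 - glucose)


# Forecast from an uncertain estimate, then correct with a finger-stick reading.
forecast_mean, forecast_var = unscented_forecast(140.0, 400.0, toy_glucose_step)
corrected_mean, corrected_var = kalman_correct(forecast_mean, forecast_var, 130.0, 100.0)
```

In a dual filter, a second filter of the same shape would run over the model parameters, which is how the mechanistic model becomes personalized as measurements accumulate.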

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Glucose measurements, forecasts, corrections, and continuous forecasts of both models for days 1–5 (left) and 20–25 (right) for participant 1.**
These representations of the tracking and forecasting processes performed by the UKF provide an intuitive understanding of the estimation procedure, as well as of the dynamics of each model. Forecast errors are generally larger for both models during the early stages of training (days 1–5). Both models produce similarly reasonable forecasts of each discrete measurement (which we establish quantitatively), yet they predict qualitatively *different* off-data continuous forecasts, represented by the lines, over the same time periods and subject to the same measurements; this difference demonstrates that the models represent different physiology.

**Fig 2. Mean squared error *integrated* over time after day 7 for participant 1 for both ultradian and meal models.**
Notice: both models show convergence in mean squared error over time; after 200 measurements the integrated mean squared error does not differentiate the two models; and as the number of measurements increases, the majority of the errors fall below the integrated mean squared error value.
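One way to read a mean squared error "integrated over time" is as a running average of squared forecast errors, updated after every measurement. The sketch below takes that reading, which is an assumption rather than the paper's stated definition; `running_mse` is a hypothetical helper name.

```python
def running_mse(forecasts, measurements):
    """Return the cumulative mean squared error after each measurement."""
    total = 0.0
    history = []
    for count, (forecast, measured) in enumerate(zip(forecasts, measurements), start=1):
        total += (forecast - measured) ** 2
        history.append(total / count)
    return history


# Illustrative numbers only, not study data.
curve = running_mse([110.0, 120.0, 130.0], [100.0, 120.0, 140.0])
```

Plotting such a curve for each model would show whether the errors settle and whether the two models remain distinguishable as measurements accumulate.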

**Fig 3**
**Top row**: carbohydrate consumption in grams, averaged per day, for participants 1 and 2; participant 1 reported making dietary changes during the course of the study, with a particular focus on reducing the amount of consumed carbohydrates, while participant 2 did not report any changes in their diet during the study. **Bottom row**: HbA1c measurements for participants 1 and 2, the measured-glucose-based HbA1c estimate, and the associated continuous DA-generated χ_o-based *daily* HbA1c smoothed estimates for both the meal and ultradian models; the χ_o-based HbA1c forecasts are accurate *and track broad glucose trends* in all cases and *the ultradian model HbA1c forecast for participant 1 predicted both the trend and the final measured HbA1c value*. The initial underestimate of participant 1’s HbA1c is due to a combination of two factors: (i) the initial HbA1c measurement was taken at a time when the model error is high because it is early in the sequential estimation process and (ii) the model training set does not include much data prior to the first HbA1c measurement.

**Fig 4. Convergence in time, i.e., *personalization* of the DA, for three parameters of the ultradian model (left) and three *different* parameters of the meal model (right) over the course of measurements for all five patients.**
Recall that the models have few overlapping parameters, so it is not possible to compare parameters between models. Notice: (i) each model converges to a different state depending on the patient, personalizing to the individual; (ii) visually, initial convergence of the parameters requires about 50 data points; (iii) patient 1, who continuously changed their behavior, had parameters that evolved in time, in contrast to patient 2, whose behavior did not evolve and whose parameters did not change appreciably after 50 data points; and (iv) when using the meal model with patient 1, the potential for underfitting due to the sigma-point constraint in the parameter filter can be observed at measurement 50. Recall that the sigma-point constraints were added to ensure robustness of the UKF, and that there exist methods for rectifying such issues, which we do not address.

**Fig 5. The kernel density estimates of the probability densities (PDFs) of glucose measurements and a variety of forecasts after day 7 for participant 1 for both ultradian and meal models.**
These PDFs are used to estimate the KL-divergence between the kernel density estimates of measured and model-forecasted glucose values. For this patient, the PDF generated by the meal model appears to resemble the PDF of the measured glucose more closely than the PDF generated by the ultradian model does. Importantly, the PDFs of the ultradian and meal models are distinct from one another, implying differing physiologic processes and mechanisms.
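The comparison described in this caption can be sketched in a few lines: build a Gaussian kernel density estimate for each set of glucose values, then integrate the KL-divergence numerically on a grid. This is a minimal sketch, not the paper's implementation; the bandwidth, the integration limits, the direction of the divergence (measured relative to forecast), and the sample values are all assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centered on samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kl_divergence(p, q, lower, upper, steps=2000):
    """Approximate KL(p || q) with a midpoint rule on [lower, upper]."""
    dx = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * dx
        px, qx = p(x), q(x)
        if px > 0.0 and qx > 0.0:  # skip points where either density underflows
            total += px * math.log(px / qx) * dx
    return total


# Illustrative values in mg/dL, not study data.
measured = gaussian_kde([95.0, 110.0, 130.0, 150.0, 170.0], bandwidth=15.0)
forecast = gaussian_kde([100.0, 105.0, 115.0, 120.0, 125.0], bandwidth=15.0)
divergence = kl_divergence(measured, forecast, 0.0, 300.0)
```

A smaller divergence means the forecast distribution resembles the measured distribution more closely; the measure is asymmetric, so swapping the arguments generally gives a different value.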

**Fig 6. Mean square error, KL-divergence, and linear correlation for the entire set of forecasts for each participant, plotted with respect to ultradian model weight.**
Boxes represent the points at which each metric is optimized. Note that most optima lie between 0.5 and 0.8. Both mean square error and linear correlation have single inflection points, whereas KL-divergence takes on many local minima. The presence of optimal weights that are far from both 0 and 1 indicates the potential value of model averaging in this context.
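The weight sweep behind this figure can be sketched as a grid search: blend the two models' forecasts with an ultradian weight `w` between 0 and 1 and keep the weight that minimizes mean squared error. The helper names, the grid resolution, and the numbers are assumptions for illustration; the same loop could score linear correlation or KL-divergence instead.

```python
def mean_squared_error(forecasts, measurements):
    """Average squared difference between forecasts and measurements."""
    return sum((f - m) ** 2 for f, m in zip(forecasts, measurements)) / len(measurements)


def best_ultradian_weight(ultradian, meal, measurements, steps=100):
    """Scan weights in [0, 1] and return (weight, error) with the lowest MSE."""
    best = None
    for i in range(steps + 1):
        weight = i / steps
        blended = [weight * u + (1.0 - weight) * m for u, m in zip(ultradian, meal)]
        error = mean_squared_error(blended, measurements)
        if best is None or error < best[1]:
            best = (weight, error)
    return best


# Illustrative values in mg/dL, not study data.
weight, error = best_ultradian_weight(
    ultradian=[100.0, 120.0, 140.0],
    meal=[140.0, 160.0, 180.0],
    measurements=[110.0, 130.0, 150.0],
)
```

An optimum away from both 0 and 1, as in this toy case, is the situation in which averaging outperforms either model alone.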

**Fig 7. The progression of model averaging performance is shown for P1, along with the weights selected by each method at each forecasting step.**
MSE-based and LC-based averaging show immediate improvements in their respective quantities, and are able to strike a balance between the high linear correlation of the meal model and the low MSE of the ultradian model. For P1, only the MSE-based average improved long-term KL divergence. In addition, the MSE-based average achieved better error than the ultradian model while nearly doubling the correlation of its forecasts with real measurements. This pattern becomes evident within the first 75 forecasts, demonstrating the feasibility and utility of doing model selection across model averaging modalities.

**Fig 8. Practical forecast accuracy of DA forecasts can be captured by calculating the percentage of measurements that are contained within glucose forecast-derived forecast ranges.**
Here we consider three such ranges: the standard deviation, variance, and range of the glucose forecast 30–120 minutes after a given meal. The *left plot* shows the last 25 glucose measurements of participant 1 and a moving window of the ultradian model-based off-data forecast ranges. From this we can see how the windowed forecast ranges predict the glucose value: the variance contains all measurements but is too wide to be useful, and the range encapsulates proportionally more measurements than the *narrower standard deviation*. The *right plot* shows the percentage of post-meal measurements captured versus the width of the boundary window.
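The capture percentage described here can be sketched as follows. For each meal, a band is derived from the forecast trajectory over the post-meal window, and a measurement counts as captured if it falls inside the band. Centering the standard deviation and variance bands on the trajectory mean is an assumption of this sketch, as are the function names and the sample values.

```python
import statistics


def forecast_bounds(trajectory, kind):
    """Return (low, high) for one post-meal forecast trajectory."""
    if kind == "range":
        return min(trajectory), max(trajectory)
    center = statistics.mean(trajectory)
    if kind == "std":
        half_width = statistics.pstdev(trajectory)
    elif kind == "variance":
        half_width = statistics.pvariance(trajectory)
    else:
        raise ValueError(f"unknown band kind: {kind}")
    return center - half_width, center + half_width


def percent_captured(measurements, trajectories, kind):
    """Percentage of measurements that fall inside their forecast-derived band."""
    hits = 0
    for measured, trajectory in zip(measurements, trajectories):
        low, high = forecast_bounds(trajectory, kind)
        if low <= measured <= high:
            hits += 1
    return 100.0 * hits / len(measurements)


# Illustrative values in mg/dL, not study data.
trajectories = [[100.0, 120.0, 140.0], [110.0, 130.0, 150.0]]
measurements = [138.0, 131.0]
coverage = {kind: percent_captured(measurements, trajectories, kind)
            for kind in ("std", "range", "variance")}
```

Comparing coverage against band width makes the trade-off explicit: a very wide band captures everything but says little about where the next measurement will land.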
