JMIR Diabetes. 2025 Sep 25:10:e69142.
doi: 10.2196/69142.

Toward a Clinically Actionable, Electronic Health Record-Based Machine Learning Model to Forecast 90-Day Change in Hemoglobin A1c in Youth With Type 1 Diabetes: Feasibility and Model Development Study

Erin M Tallon et al. JMIR Diabetes.

Abstract

Background: Clinicians currently lack an effective means of identifying youth with type 1 diabetes (T1D) who are at risk for glycemic deterioration between diabetes clinic visits. As a result, their ability to identify youth who may benefit most from targeted interventions designed to address rising glycemic levels is limited. Although electronic health record (EHR)-based risk predictions have been used to forecast health outcomes in T1D, no study has investigated the potential for using EHR data to identify youth with T1D who will experience a clinically significant rise in glycated hemoglobin (HbA1c) of ≥0.3% (approximately 3 mmol/mol) between diabetes clinic visits.

Objective: We aimed to evaluate the feasibility of using routinely collected EHR data to develop a machine learning model to predict 90-day unit-change in HbA1c (in % units) in youth (aged 9-18 y) with T1D. We assessed our model's ability to augment clinical decision-making by identifying a percent change cut point that optimized identification of youth who would experience a clinically significant rise in HbA1c.

Methods: From a cohort of 2757 youth with T1D who received care from a network of pediatric diabetes clinics in the Midwestern United States (January 2012-August 2017), we identified 1743 youth with 9643 HbA1c observation windows (ie, 2 HbA1c measurements separated by 70-110 d, approximating the 90-day time interval between routine diabetes clinic visits). We used up to 5 years of youths' longitudinal EHR data to transform 17,466 features (demographics, laboratory results, vital signs, anthropometric measures, medications, diagnosis codes, procedure codes, and free-text data) for model training. We performed 3-fold cross-validation to train random forest regression models to predict 90-day unit-change in HbA1c(%).
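
The observation-window definition above (2 HbA1c measurements separated by 70-110 d) can be illustrated with a minimal sketch. It assumes that windows are formed from consecutive measurements for a single youth, which the abstract does not state; the function name, record layout, and example values are hypothetical and are not study data.

```python
from datetime import date


def observation_windows(measurements, min_days=70, max_days=110):
    """Pair consecutive HbA1c measurements that lie min_days..max_days apart.

    measurements: list of (date, hba1c_percent) tuples for one youth.
    Assumption: only consecutive measurements are paired; the study's
    actual pairing rule is not given in the abstract.
    """
    ordered = sorted(measurements, key=lambda m: m[0])
    windows = []
    for (d0, a0), (d1, a1) in zip(ordered, ordered[1:]):
        gap = (d1 - d0).days
        if min_days <= gap <= max_days:
            windows.append({
                "start": d0,
                "end": d1,
                "gap_days": gap,
                # Unit-change in HbA1c (% units) across the window.
                "delta": round(a1 - a0, 2),
            })
    return windows


# Illustrative values only.
example = [
    (date(2016, 1, 5), 8.1),
    (date(2016, 4, 4), 8.6),   # 90 days after the first measurement
    (date(2016, 5, 4), 8.4),   # 30 days later: too close, no window
    (date(2016, 8, 10), 9.0),  # 98 days later
]
windows = observation_windows(example)
```

In this toy series, two of the three consecutive pairs fall inside the 70-110 day range and would each contribute one training target (the unit-change in HbA1c).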

Results: Across all 3 folds of our cross-validation model, the average root-mean-square error was 0.88 (95% CI 0.85-0.90). Predicted HbA1c(%) strongly correlated with true HbA1c(%) (r=0.79; 95% CI 0.78-0.80). The top 10 features impacting model predictions included postal code, various metrics related to HbA1c, and the frequency of a diagnosis code indicating difficulty with treatment engagement. At a clinically significant percent rise threshold of ≥0.3% (approximately 3 mmol/mol), our model's positive predictive value was 60.3%, indicating a 1.5-fold enrichment (relative to the observed frequency that youth experienced this outcome [3928/9643, 40.7%]). Model sensitivity and positive predictive value improved when thresholds for clinical significance included smaller changes in HbA1c, whereas specificity and negative predictive value improved when thresholds required larger changes in HbA1c.
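
The enrichment figure and the threshold-dependent metrics reported above can be reproduced in outline as follows. This is a sketch under stated assumptions: it assumes a youth is flagged when the predicted change meets the same cut point used to define a clinically significant rise, which the abstract does not spell out, and the helper names and toy values are hypothetical rather than study data.

```python
def threshold_metrics(true_delta, pred_delta, threshold=0.3):
    """Dichotomize true and predicted HbA1c change at a cut point and
    return confusion-matrix counts plus the four standard metrics."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_delta, pred_delta):
        actual = t >= threshold    # youth truly had a significant rise
        flagged = p >= threshold   # model predicted a significant rise
        if flagged and actual:
            tp += 1
        elif flagged and not actual:
            fp += 1
        elif not flagged and actual:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def enrichment(ppv, prevalence):
    """Fold enrichment of PPV over the observed outcome frequency."""
    return ppv / prevalence


# Figures reported in the abstract: 3928 of 9643 windows showed a rise
# of >=0.3%, and the model's PPV at that threshold was 60.3%.
prevalence = 3928 / 9643
fold = enrichment(0.603, prevalence)

# Toy values (not study data) to exercise the metric function.
toy_true = [0.5, 0.1, -0.2, 0.4, 0.3, 0.0]
toy_pred = [0.4, 0.35, -0.1, 0.1, 0.3, 0.2]
toy = threshold_metrics(toy_true, toy_pred)
```

With the reported figures, the observed frequency is about 40.7% and the ratio 0.603/0.407 is about 1.48, which rounds to the 1.5-fold enrichment stated above.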

Conclusions: Routinely collected EHR data can be used to create an ML model for predicting unit-change in HbA1c between diabetes clinic visits among youth with T1D. Future work will focus on optimizing model performance and validating the model in additional cohorts and in other diabetes clinics.

Keywords: AI, artificial intelligence; EHR, electronic health records; HbA1c, hemoglobin A1c; T1D, type 1 diabetes; adolescent; clinical decision support; glycemic control; machine learning; pediatric; population health; prediction; real-world data; youth.

Conflict of interest statement

Conflicts of Interest: CM and LD are employees of Blue Circle Health. RM is a consultant for Sanofi. ML has received research grants from Eli Lilly and Novo Nordisk and has been a consultant or has received honoraria from Astra Zeneca, Boehringer Ingelheim, Eli Lilly, Nordicinfu Care, Novo Nordisk, and Rubin Medical, all outside the submitted work. MAC is a consultant for Glooko, Inc. and receives research support from Dexcom and Abbott Diabetes Care. All other authors are responsible for the reported research and stated that they have no affiliation, financial agreement, or involvement with any company or other organization with a financial interest in the subject matter of the submitted manuscript.

Figures

Figure 1. Flowchart depicting inclusion and exclusion criteria for the study cohort and for glycated hemoglobin observation windows. Abbreviations: HbA1c: glycated hemoglobin; T1D: type 1 diabetes.
Figure 2. Distribution of the prediction error (ie, residuals) across all 3 cross-validation K-folds. Root-mean-square error is equal to the SD of the prediction error. RMSE: root-mean-square error.
Figure 3. Top 10 most important features for predicting 90-day percent change in glycated hemoglobin, assessed via gain-based feature importance. In random forest regression, gain is a feature importance measure that reflects, for a given feature, the mean increase in node purity (ie, mean reduction in variance) that the feature contributes across all splits in which it is used. Z91.19 is a diagnosis code from the ICD-10 (International Classification of Diseases, Tenth Revision) that is used to code for nonadherence to, or noncompliance with, medical treatment. Dx: diagnosis; HbA1c: hemoglobin A1c.
