Nat Sci Sleep. 2025 Aug 30;17:2013-2025.
doi: 10.2147/NSS.S512262. eCollection 2025.

Explainable Machine Learning Assists in Revealing Associations Between Polysomnographic Biomarkers and Incident Type 2 Diabetes in Men

Duc Phuc Nguyen et al. Nat Sci Sleep.

Abstract

Introduction: Type 2 diabetes (T2D) shows bidirectional relationships with polysomnographic measures. However, no studies have searched systematically for novel polysomnographic biomarkers of T2D. We therefore investigated whether state-of-the-art explainable machine learning (ML) models could identify new polysomnographic biomarkers predictive of incident T2D.

Methods: We applied explainable ML models to longitudinal cohort study data from 536 males who were free of T2D at baseline, 52 of whom had developed T2D at follow-up (mean 8.3 years, range 3.5-10.5 years). Beyond ranking biomarker importance, we explored how the explainable ML approach can identify novel relationships, assist in hypothesis testing, and provide insights into risk factors.

Results: The top five most predictive biomarkers included waist circumference, glucose, and three novel sleep biomarkers: the number of 3% desaturations in non-supine sleep, mean heart rate in supine sleep, and mean hypopnea duration. Explainable machine learning identified a significant association between the number of non-supine desaturation events (threshold of 19 events) and incident T2D (Odds ratio = 2.4 [95% CI 1.2-4.8], P = 0.013). No significant associations were found using continuous or quartiled versions of non-supine desaturation. Additionally, the model provided an individualized risk factor breakdown, supporting a more personalized approach to precision sleep medicine.
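The thresholded association reported above can be illustrated with a small sketch. The abstract does not state how the odds ratio was estimated or adjusted, so this assumes an unadjusted 2x2 table with a Wald confidence interval, and it assumes the threshold is applied as "19 or more events". The function names and all counts below are hypothetical and are not the study's data.

```python
import math

def two_by_two(event_counts, developed_t2d, threshold=19):
    """Cross-tabulate exposure (count at or above threshold) against outcome.

    Returns (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases.
    """
    a = b = c = d = 0
    for n_events, is_case in zip(event_counts, developed_t2d):
        exposed = n_events >= threshold  # assumption: ">=" rather than ">"
        if exposed and is_case:
            a += 1
        elif exposed:
            b += 1
        elif is_case:
            c += 1
        else:
            d += 1
    return a, b, c, d

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald interval on the log-odds scale."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the arithmetic.
or_, lo, hi = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {or_:.2f} [95% CI {lo:.2f}-{hi:.2f}]")
```

Dichotomizing at a data-derived cut point can surface an association that a linear or quartiled term misses, which is consistent with the null results reported for the continuous and quartiled versions, but a threshold found this way would need validation in independent data.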

Conclusion: Explainable ML supports the role of established biomarkers and reveals novel biomarkers of T2D likely to help guide further hypothesis testing and validation of more robust and clinically useful biomarkers. Although further validation is needed, these proof-of-concept data support the benefits of explainable ML in prospective data analysis.

Keywords: explainable machine learning; obstructive sleep apnoea; polysomnographic biomarkers; type 2 diabetes.

Conflict of interest statement

Professor Peter Catcheside reports grants from National Health and Medical Research Council, grants from Defence Science and Technology Group, Compumedics Ltd, Invicta Medical, Garnett Passe and Rodney Williams Memorial Foundation, MND Australia, American Academy of Sleep Medicine, Lifetime Support Authority, Flinders Foundation, and a patent US-20210327584-A1 with royalties paid to Flinders University. Dr Bastien Lechat reports grants from Withings. Dr Andrew Vakulin reports grants from National Health and Medical Research Council of Australia (NHMRC), Philips Respironics, ResMed, ResMed Foundation, Lifetime Support Authority, Medical Research Future Fund (MRFF), and a patent PCT/AU2019/051147 Decision Support Software System for Sleep Disorder Identification licensed to Philips Respironics. Professor Robert Adams reports grants from National Health and Medical Research Council, The Hospital Research Foundation, National Heart Foundation, ResMed Foundation, Philips Respironics, and Australian Government. The other authors have no competing interests to disclose.

The abstract of this paper was presented at the 2024 Australasian Sleep Association conference as a poster presentation with interim findings. The poster’s abstract was published in Poster Abstracts in Sleep Advances: https://doi.org/10.1093/sleepadvances/zpae070.142.

Figures

Figure 1
Schematic demonstrating data-driven approaches for exploring the association between sleep biomarkers and type 2 diabetes. (a) Flow diagram of the study and follow-up assessments. (b) Biomarkers and features from different domains including sleep, breathing disorders, and demographics were extracted to input into the XGBoost machine learning model. (c) The trained model was passed through the AI Explainer framework for estimating the relative importance of biomarkers based on Shapley values. (d) The least important biomarker was removed from the data, and this process repeated until the top 15 most important biomarkers were found (d and e). (e) The output of the explainable ML model was used to reveal novel biomarkers (f), assist hypothesis testing (g), and gain insights into risk factors and assist treatment approaches (h).
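The elimination loop described in panel (d) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: `importance_fn` is a hypothetical callable standing in for "refit the model on the remaining biomarkers and return their Shapley-based importances", and the toy scores are invented.

```python
from typing import Callable, Dict, List

def backward_eliminate(
    features: List[str],
    importance_fn: Callable[[List[str]], Dict[str, float]],
    keep: int = 15,
) -> List[str]:
    """Drop the least important feature repeatedly until `keep` remain.

    Importances are recomputed after every removal, as in the caption.
    Returns the surviving features, most important first.
    """
    remaining = list(features)
    while len(remaining) > keep:
        scores = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: scores[name])
        remaining.remove(weakest)
    final_scores = importance_fn(remaining)
    return sorted(remaining, key=lambda name: final_scores[name], reverse=True)

# Toy stand-in: fixed scores instead of a refitted model (hypothetical values).
TOY_SCORES = {"waist": 0.9, "glucose": 0.8, "hr_supine": 0.4,
              "noise_a": 0.1, "noise_b": 0.05}

def toy_importance(names: List[str]) -> Dict[str, float]:
    return {name: TOY_SCORES[name] for name in names}

top3 = backward_eliminate(list(TOY_SCORES), toy_importance, keep=3)
print(top3)
```

In a real analysis the importances would change between iterations because the model is retrained on fewer inputs; the fixed scores here only demonstrate the control flow.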
Figure 2
Explainable ML reveals novel biomarkers associated with diabetes. (a) Top 15 strongest biomarkers associated with incident T2D. The ranking is based on the mean absolute Shapley values (see Methods for details). Colors indicate the values of biomarkers categorized into four quartiles. A positive change in log odds indicates a higher risk of T2D. (b) The relationship between the top 15 biomarkers and the risk of T2D. The change in log odds is indicated on the y-axis, while the change in biomarker values is shown on the x-axis. The black dashed lines represent smooth curves across the data points to illustrate the overall shape of the relationship. Dot points indicate data from 536 participants.
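The ranking rule in panel (a), mean absolute Shapley value per biomarker, reduces to a short computation once per-participant Shapley values are available. The sketch below assumes those values are already given as a participants-by-biomarkers table; the names and numbers are hypothetical.

```python
def rank_by_mean_abs(shap_rows, names):
    """Rank features by the mean of |Shapley value| across participants.

    shap_rows: one list per participant, one value per feature (log-odds units).
    Returns (name, mean_abs) pairs, largest first.
    """
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical Shapley values for three participants and three biomarkers.
names = ["waist", "glucose", "age"]
shap_rows = [
    [0.5, -0.2, 0.1],
    [-0.7, 0.4, 0.0],
    [0.6, -0.3, -0.1],
]
ranking = rank_by_mean_abs(shap_rows, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters: a biomarker that raises risk for some participants and lowers it for others would otherwise average toward zero and look unimportant.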
Figure 3
Comparison of typical scenarios leading to different outcomes at follow-up assessment. (a and b) Comparison of two participants with low serum glucose concentrations at baseline. (c and d) Comparison of two participants with high serum glucose concentrations at baseline. Red and blue colors indicate increased or decreased risk, respectively. The baseline risk for all participants was calculated as the mean of the model output, indicated by green dashed lines (log odds = −0.193), while the final risk outcome is indicated by red dashed lines after adding the additive contributions of all other risk factors.
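The additive breakdown in this figure can be sketched numerically: each participant's final log odds is the baseline (−0.193, from the caption) plus the sum of that participant's per-biomarker contributions. The contribution values below are hypothetical, and converting the result to a probability with the logistic function assumes the model output is a plain log odds.

```python
import math

BASELINE_LOG_ODDS = -0.193  # mean model output reported in the caption

def risk_from_contributions(contributions, baseline=BASELINE_LOG_ODDS):
    """Add per-biomarker contributions to the baseline log odds.

    Returns (final_log_odds, probability), where the probability is the
    logistic transform of the final log odds.
    """
    log_odds = baseline + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return log_odds, probability

# Hypothetical participant: positive values raise risk, negative values lower it.
example = {"glucose": 0.8, "waist_circumference": 0.5, "hr_supine": -0.3}
final_log_odds, final_prob = risk_from_contributions(example)
_, baseline_prob = risk_from_contributions({})

print(f"baseline probability: {baseline_prob:.3f}")
print(f"final log odds: {final_log_odds:.3f}, probability: {final_prob:.3f}")
```

Because the contributions are additive on the log-odds scale, two participants with the same glucose value can end up on opposite sides of the baseline depending on their other biomarkers, which is the contrast the paired panels draw.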
