Review

Cardiovasc Diabetol. 2023 Sep 25;22(1):259.

doi: 10.1186/s12933-023-01985-3.

Machine learning in precision diabetes care and cardiovascular risk prediction

Evangelos K Oikonomou¹, Rohan Khera²,³,⁴,⁵

Affiliations

¹ Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.
² Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA. rohan.khera@yale.edu.
³ Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA. rohan.khera@yale.edu.
⁴ Section of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT, USA. rohan.khera@yale.edu.
⁵ Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, 195 Church St, 6th floor, New Haven, CT, 06510, USA. rohan.khera@yale.edu.

PMID: 37749579
PMCID: PMC10521578
DOI: 10.1186/s12933-023-01985-3

Evangelos K Oikonomou et al. Cardiovasc Diabetol. 2023.

Abstract

Artificial intelligence and machine learning are driving a paradigm shift in medicine, promising data-driven, personalized solutions for managing diabetes and the excess cardiovascular risk it poses. In this comprehensive review of machine learning applications in the care of patients with diabetes at increased cardiovascular risk, we offer a broad overview of various data-driven methods and how they may be leveraged in developing predictive models for personalized care. We review existing as well as expected artificial intelligence solutions in the context of diagnosis, prognostication, phenotyping, and treatment of diabetes and its cardiovascular complications. In addition to discussing the key properties of such models that enable their successful application in complex risk prediction, we define challenges that arise from their misuse and the role of methodological standards in overcoming these limitations. We also identify key issues in equity and bias mitigation in healthcare and discuss how the current regulatory framework should ensure the efficacy and safety of medical artificial intelligence products in transforming cardiovascular care and outcomes in diabetes.

Keywords: Artificial intelligence; Cardiovascular disease; Diabetes; Digital health; Machine learning; Personalized medicine; Prediction.

Conflict of interest statement

E.K.O. and R.K. are co-inventors of the U.S. Patent Applications 63/508,315 and 63/177,117 and co-founders of Evidence2Health, a health analytics company to improve evidence-based cardiovascular care. E.K.O. reports a consultancy and stock option agreement with Caristo Diagnostics Ltd (Oxford, U.K.), unrelated to the current work. R.K. received support from the National Heart, Lung, and Blood Institute of the National Institutes of Health (under award K23HL153775) and the Doris Duke Charitable Foundation (under award 2022060). R.K. further receives research support, through Yale, from Bristol-Myers Squibb and Novo Nordisk, unrelated to the current work. He is a co-inventor of U.S. Pending Patent Applications 63/428,569 and 63/346,610, unrelated to the current work. He is an Associate Editor at JAMA.

Figures

**Fig. 1**
Overview of commonly used algorithms in medical machine learning

**Fig. 2**
Discrimination, calibration, and net clinical benefit. The comprehensive evaluation of a predictive model requires the simultaneous evaluation of its discrimination, calibration, and incremental value beyond the current standard of care. (A) The area under the receiver operating characteristic curve (AUROC) reflects the trade-off between sensitivity (true positive rate) and specificity (1 − false positive rate) at different thresholds and provides a measure of separability, in other words the ability of the model to distinguish between classes (0.5 = no separation, 1 = perfect separation). (B) Models with similar AUROC may exhibit different behavior when the prevalence of the label varies. The precision–recall curve demonstrates the trade-off between the positive predictive value (*precision*) and sensitivity (*recall*), and illustrates how the area under the curve may vary substantially as the prevalence of the label of interest decreases from 50% to 5%. (C) Models with similar AUROC may also differ in their calibration. A model with good calibration (blue line) makes probabilistic predictions that match real-world probabilities. On the other hand, the model shown in orange underestimates and overestimates risk at lower and higher prediction thresholds, respectively. (D) Finally, models should be compared against the established standard of care while incorporating clinical consequences, comparing the net clinical benefit across varying risk levels against established risk stratification approaches or no risk stratification. Curves were generated using synthetic datasets for illustration purposes.
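The quantities named in the Fig. 2 caption can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`auroc`, `precision_recall`, `calibration_bins`, `net_benefit`) are hypothetical, the data are synthetic, and net benefit is computed with the usual decision-curve definition, TP/n − (FP/n) × pt/(1 − pt), which the caption itself does not spell out.

```python
import random


def auroc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def _counts(y_true, y_prob, threshold):
    """True positives, false positives, false negatives when flagging p >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    return tp, fp, fn


def precision_recall(y_true, y_prob, threshold):
    """Positive predictive value (precision) and sensitivity (recall) at one threshold."""
    tp, fp, fn = _counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


def calibration_bins(y_true, y_prob, n_bins=5):
    """(mean predicted risk, observed event rate, count) for each non-empty equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins
        if b
    ]


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at risk threshold pt (assumes 0 < pt < 1)."""
    tp, fp, _ = _counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - fp / n * threshold / (1 - threshold)


# Synthetic cohort: outcomes are drawn from the predicted risks themselves,
# so the "model" is well calibrated by construction (illustration only).
rng = random.Random(0)
y_prob = [rng.random() for _ in range(1000)]
y_true = [1 if rng.random() < p else 0 for p in y_prob]

print("AUROC:", round(auroc(y_true, y_prob), 3))
print("Precision, recall at 0.5:", precision_recall(y_true, y_prob, 0.5))
for mean_pred, observed, count in calibration_bins(y_true, y_prob):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")

# Compare the model against a "treat all" strategy at a 20% risk threshold;
# "treat none" has a net benefit of 0 by definition.
treat_all = net_benefit(y_true, [1.0] * len(y_true), 0.2)
print("Net benefit (model vs treat all):", net_benefit(y_true, y_prob, 0.2), treat_all)
```

Reporting all four quantities together mirrors the caption's point that two models with a similar AUROC can still differ in precision at low prevalence, in calibration, and in net benefit relative to default strategies.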

**Fig. 3**
Phenomapping-derived tools for personalized effect estimates. Phenomaps enable a visual and topological representation of the baseline phenotypic variance of a trial population while accounting for many pre-randomization features. As shown in an analysis of the Canagliflozin Cardiovascular Assessment (CANVAS) trial [138], a phenomap representation of all enrolled patients shows that the study arms are randomly distributed in the phenotypic space (A). Through a series of iterative analyses centered around each patient’s unique phenotypic location, a machine learning model can learn phenotypic signatures associated with distinct responses to canagliflozin versus placebo therapy (B, C). An extreme gradient boosting algorithm trained to describe this heterogeneity in treatment effect in CANVAS successfully stratified the independent CANVAS-R population into high- (D) and low-responders (E). Panels reproduced with permission from Oikonomou et al. [12]

**Fig. 4**
Machine learning for predictive enrichment of randomized controlled trials. Machine learning can be used to guide adaptive clinical trial design through data-driven inference and predictive enrichment. Traditional fixed trial designs do not allow modifications in the patient population, whereas sample size adaptations only allow interim revisions in the power calculations and target sample sizes based on the accumulating rate of primary outcome and safety events. In trials where there is clinically meaningful heterogeneity in the treatment effect, a priori inclusion of machine learning-based, data-driven inference may provide early signals of heterogeneous benefit or harm and a reference for adaptive predictive enrichment. This approach can optimize the trial’s efficacy, shorten its duration, minimize its costs, maximize inference, and ultimately ensure safety for the study participants. ML, machine learning

**Fig. 5**
Explainability and interpretability of medical machine learning. Broadly speaking, more complex algorithms demonstrate better performance when dealing with complex tasks and data inputs. For instance, the recognition of cardiomyopathy using echocardiographic videos may require a deep learning algorithm to model the full extent of temporal and spatial features that carry diagnostic value, whereas the risk of re-admission may be modelled from electronic health record data using generalized linear models. Simpler models, such as decision trees and linear models, are intuitive and interpretable, whereas ensemble and neural network-based methods are too complex for the human mind to fully understand. Explainable artificial intelligence (XAI) methods aim to bridge this interpretability gap by offering direct or indirect insights into the inner workings of complex algorithms

Cited by

CarDS-Plus ECG Platform: Development and Feasibility Evaluation of a Multiplatform Artificial Intelligence Toolkit for Portable and Wearable Device Electrocardiograms.
Shankar SV, Oikonomou EK, Khera R. medRxiv [Preprint]. 2023 Oct 3:2023.10.02.23296404. doi: 10.1101/2023.10.02.23296404. PMID: 37873174. Free PMC article. Preprint.

Prediction model for type 2 diabetes mellitus and its association with mortality using machine learning in three independent cohorts from South Korea, Japan, and the UK: a model development and validation study.
Lee H, Hwang SH, Park S, Choi Y, Lee S, Park J, Son Y, Kim HJ, Kim S, Oh J, Smith L, Pizzol D, Rhee SY, Sang H, Lee J, Yon DK. EClinicalMedicine. 2025 Jan 18;80:103069. doi: 10.1016/j.eclinm.2025.103069. eCollection 2025 Feb. PMID: 39896872. Free PMC article.

Chronic kidney disease and dementia: an epidemiological perspective.
Ikram MA. Nat Rev Nephrol. 2025 Aug;21(8):525-535. doi: 10.1038/s41581-025-00967-w. Epub 2025 May 22. PMID: 40404981. Review.

A methodological showcase: utilizing minimal clinical parameters for early-stage mortality risk assessment in COVID-19-positive patients.
Yan JK. PeerJ Comput Sci. 2024 Apr 30;10:e2017. doi: 10.7717/peerj-cs.2017. eCollection 2024. PMID: 38855224. Free PMC article.

Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction.
Mohanty PK, Francis SAJ, Barik RK, Roy DS, Saikia MJ. Bioengineering (Basel). 2024 Nov 30;11(12):1215. doi: 10.3390/bioengineering11121215. PMID: 39768033. Free PMC article.

References

1. Haug CJ, Drazen JM. Artificial intelligence and machine learning in clinical medicine, 2023. N Engl J Med. 2023;388(13):1201–1208.
2. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44–56.
3. Joseph JJ, Deedwania P, Acharya T, Aguilar D, Bhatt DL, Chyun DA, et al. Comprehensive management of cardiovascular risk factors for adults with type 2 diabetes: a scientific statement from the American Heart Association. Circulation. 2022;145(9):e722–e759.
4. Ong KL, Stafford LK, McLaughlin SA, Boyko EJ, Vollset SE, Smith AE, et al. Global, regional, and national burden of diabetes from 1990 to 2021, with projections of prevalence to 2050: a systematic analysis for the Global Burden of Disease Study 2021. Lancet. 2023. doi: 10.1016/S0140-6736(23)01301-6.
5. Ravaut M, Harish V, Sadeghi H, Leung KK, Volkovs M, Kornas K, et al. Development and validation of a machine learning model using administrative health data to predict onset of type 2 diabetes. JAMA Netw Open. 2021;4(5):e2111315.
