PLOS Digit Health. 2025 Aug 5;4(8):e0000702.
doi: 10.1371/journal.pdig.0000702. eCollection 2025 Aug.

Development and Validation of DIANA (Diabetes Novel Subgroup Assessment tool): A web-based precision medicine tool to determine type 2 diabetes endotype membership and predict individuals at risk of microvascular disease


Viswanathan Baskar et al. PLOS Digit Health.

Abstract

Background: Previous research has identified four distinct endotypes of type 2 diabetes in Asian Indians: Severe Insulin Deficient Diabetes (SIDD), Combined Insulin Resistant and Deficient Diabetes (CIRDD), Insulin Resistant Obese Diabetes (IROD), and Mild Age-related Diabetes (MARD). DIANA (Diabetes Novel Subgroup Assessment) is an online precision medicine tool that predicts type 2 diabetes endotype membership and an individual's risk of retinopathy and nephropathy.

Methodology: The DIANA tool determines subgroup membership with a support vector machine (SVM) model trained, using k-fold cross-validation, on type 2 diabetes (T2D) subgroups in the Asian Indian population. Its performance was compared with an algorithm based on conditional pre-determined cut-offs and weights for each clinical feature [age at diagnosis, BMI, waist circumference, HbA1c, serum triglycerides, HDL cholesterol and, optionally, fasting and stimulated C-peptide]. Local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP) were used to explain the endotype prediction model. A random forest model was built to assess an individual's risk of nephropathy and retinopathy based on individual risk algorithms.
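As an illustration of the approach described in the Methodology, the following is a minimal sketch of an SVM classifier evaluated with k-fold cross-validation. It is not the authors' implementation: the data are synthetic, and the choice of scikit-learn, the RBF kernel, feature standardisation, and k = 5 are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the study's data): six features mimic the six
# mandatory clinical inputs, four classes mimic the four endotypes.
X, y = make_classification(
    n_samples=400,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Assumed model: standardise features, then fit an RBF-kernel SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# Assumed k = 5; stratification keeps class proportions similar in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("fold accuracies:", np.round(scores, 3))
print("mean accuracy:", round(float(scores.mean()), 3))
```

Placing the scaler inside the pipeline means it is refitted on the training portion of every fold, so no information from the held-out fold leaks into preprocessing.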

Findings: Compared with the conditional pre-determined cut-offs, the SVM model achieved higher accuracy (98% vs 63.6%), specificity (99.8% vs 88%), sensitivity (98.5% vs 65.1%), and precision (98.7% vs 63.4%). In clinician face value validation, the SVM model also outperformed the cut-offs on accuracy (97% vs 85%), specificity (95.3% vs 63%), sensitivity (95.8% vs 73%), and precision (98.9% vs 66.9%). LIME and SHAP analyses demonstrated the impact of individual features on the ML models. The accuracy of the random forest risk prediction model was 89.6% (p < 0.05) for nephropathy and 78.4% (p < 0.05) for retinopathy.
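For readers unfamiliar with how the four reported metrics apply to a four-class problem, the sketch below computes them from a confusion matrix by treating each class one-vs-rest and macro-averaging. The abstract does not say how the authors averaged across endotypes, so the macro-averaging is an assumption, and the confusion matrix is invented purely for illustration.

```python
def one_vs_rest_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix.

    cm[i][j] is the number of cases whose true class is i and whose
    predicted class is j. Assumes every class occurs at least once as a
    true label and at least once as a prediction.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    sens, spec, prec = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k cases missed
        fp = sum(cm[i][k] for i in range(n)) - tp   # other cases called k
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
        prec.append(tp / (tp + fp))
    return {
        "accuracy": correct / total,
        "sensitivity": sum(sens) / n,
        "specificity": sum(spec) / n,
        "precision": sum(prec) / n,
    }


# Hypothetical counts; rows = true class, columns = predicted class,
# in the illustrative order SIDD, CIRDD, IROD, MARD.
cm = [
    [45, 2, 2, 1],
    [3, 40, 5, 2],
    [1, 4, 42, 3],
    [0, 2, 3, 45],
]

metrics = one_vs_rest_metrics(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With balanced classes, as in this made-up matrix, macro-averaged sensitivity equals overall accuracy; on imbalanced cohorts the two can differ substantially.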

Conclusion: We conclude that DIANA is an accurate, clinically explainable AI tool that clinicians can use to make informed decisions on risk assessment and to provide precision management to individuals with new-onset type 2 diabetes.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Data selection for microvascular risk prediction model.
The figure outlines the data selection process for the nephropathy (A) and retinopathy (B) datasets: 80,118 patients, drawn from 6,30,000 records in DEMR, with complete baseline parameters including age, sex, age at diabetes diagnosis, BMI, waist circumference, HbA1c, serum lipids, creatinine, blood pressure, and retinopathy examination.
Fig 2. Validation of SVM and PDCA endotype classification using K-means clustering and model performance comparison.
The figure illustrates the validation of SVM and PDCA in classifying diabetes endotypes using a k-means clustering approach into four subgroups: SIDD (severe insulin-deficient diabetes), IROD (insulin-resistant obese diabetes), CIRDD (combined insulin-resistant and deficient diabetes), and MARD (mild age-related diabetes), in a validation cohort of 19,084 individuals.
Fig 3. Clinician face value validation of SVM and PDCA for endotype classification.
This figure presents the face value validation of the predicted T2D endotypes by SVM and PDCA using a subset cohort of 450 individuals, classified into four groups: SIDD (severe insulin-deficient diabetes), IROD (insulin-resistant obese diabetes), CIRDD (combined insulin-resistant and deficient diabetes), and MARD (mild age-related diabetes).
