Diabetol Metab Syndr. 2025 Jun 18;17(1):227.
doi: 10.1186/s13098-025-01786-6.

Machine learning-based stratification of prediabetes and type 2 diabetes progression

Marwa Matboli et al. Diabetol Metab Syndr.

Abstract

Background: Diabetes mellitus, a global health concern with severe complications, demands early detection and precise staging for effective management. Machine learning approaches, combined with bioinformatics, offer promising avenues for enhancing diagnostic accuracy and identifying key biomarkers.

Methods: This study used a multi-class classification framework to assign patients to one of four health states: healthy, prediabetes, type 2 diabetes mellitus (T2DM) without complications, and T2DM with complications. Three models were developed, based on molecular markers, biochemical markers, and a combination of both. Five machine learning classifiers were applied: Random Forest (RF), Extra Trees Classifier, Quadratic Discriminant Analysis, Naïve Bayes, and Light Gradient Boosting Machine. To improve the robustness and precision of the classification, Recursive Feature Elimination with Cross-Validation (RFECV) and fivefold cross-validation were used. The multi-class classification approach enabled effective discrimination between the four diabetes stages.
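
The general idea behind the Methods (cross-validated feature elimination feeding a four-class classifier) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, only a Gaussian Naive Bayes classifier is shown (one of the five classifier families named above), and a greedy backward elimination scored by stratified fivefold accuracy stands in for RFECV, which in implementations such as scikit-learn ranks features by model importance instead. All function names are hypothetical.

```python
import math
import random
from collections import defaultdict

CLASSES = ["healthy", "prediabetes", "T2DM", "T2DM with complications"]

def make_synthetic(n_per_class=40, seed=0):
    """Hypothetical data: features 0-1 track the class, features 2-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(len(CLASSES)):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(-3.0 * label, 1.0),
                      rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y

def stratified_folds(y, k=5, seed=0):
    """Split sample indices into k folds with classes balanced across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit_gnb(X, y, cols):
    """Per-class means, variances and log-prior for the selected columns."""
    model = {}
    for label in set(y):
        rows = [[x[c] for c in cols] for x, t in zip(X, y) if t == label]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (means, variances, math.log(len(rows) / len(y)))
    return model

def predict_gnb(model, x, cols):
    """Return the class with the highest Gaussian log-posterior."""
    def log_post(label):
        means, variances, log_prior = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (x[c] - m) ** 2 / (2 * s)
            for c, m, s in zip(cols, means, variances))
    return max(model, key=log_post)

def cv_accuracy(X, y, cols, k=5, seed=0):
    """Mean accuracy over k stratified folds using only the columns in cols."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [t for i, t in enumerate(y) if i not in held_out]
        model = fit_gnb(X_train, y_train, cols)
        hits = sum(predict_gnb(model, X[i], cols) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k

def backward_elimination(X, y, k=5):
    """Greedily drop one feature at a time; keep the best-scoring subset seen."""
    cols = list(range(len(X[0])))
    best_cols, best_score = cols[:], cv_accuracy(X, y, cols, k)
    while len(cols) > 1:
        # Remove the feature whose absence leaves the highest CV accuracy.
        score, drop = max(
            (cv_accuracy(X, y, [c for c in cols if c != d], k), d) for d in cols)
        cols.remove(drop)
        if score >= best_score:  # ties favour the smaller feature set
            best_cols, best_score = cols[:], score
    return best_cols, best_score

X, y = make_synthetic()
selected, score = backward_elimination(X, y)
print("selected feature indices:", selected, "CV accuracy:", round(score, 3))
```

Stratifying the folds keeps all four classes represented in every training and test split, which matters when some stages have few patients.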

Results: The top contributing features identified for the combined model through RFECV included three molecular markers (miR342, NFKB1, and miR636) and two biochemical markers (the albumin-to-creatinine ratio and HDLc), indicating their strong association with diabetes progression. The Extra Trees Classifier achieved the highest performance across all models, with an AUC of 0.9985 (95% CI: 0.994-1.000), demonstrating its robustness and applicability for precise diabetes staging.
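
The abstract does not state how the four-class AUC was averaged or how its confidence interval was obtained. One common approach, assumed here purely for illustration, is a macro-averaged one-vs-rest AUC with a percentile bootstrap interval. The sketch below uses hypothetical predicted probabilities, not the study's data, and all function names are made up for the example.

```python
import random

def binary_auc(scores, positives):
    """AUC as the chance a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        return None  # undefined when one side is empty
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, proba, n_classes):
    """Unweighted mean of one-vs-rest AUCs; None if any class is absent."""
    aucs = []
    for c in range(n_classes):
        auc = binary_auc([row[c] for row in proba], [t == c for t in y_true])
        if auc is None:
            return None
        aucs.append(auc)
    return sum(aucs) / n_classes

def bootstrap_ci(y_true, proba, n_classes, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the macro one-vs-rest AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        auc = macro_ovr_auc([y_true[i] for i in idx],
                            [proba[i] for i in idx], n_classes)
        if auc is not None:  # skip resamples that lost a class
            stats.append(auc)
    stats.sort()
    m = len(stats)
    lower = stats[int((alpha / 2) * m)]
    upper = stats[max(int((1 - alpha / 2) * m) - 1, 0)]
    return lower, upper

# Hypothetical probabilities for 60 samples over 4 classes: the true class
# gets a boosted raw score before each row is normalised to sum to 1.
rng = random.Random(1)
y_true = [i % 4 for i in range(60)]
proba = []
for t in y_true:
    raw = [rng.random() + (1.0 if c == t else 0.0) for c in range(4)]
    total = sum(raw)
    proba.append([r / total for r in raw])

point = macro_ovr_auc(y_true, proba, 4)
low, high = bootstrap_ci(y_true, proba, 4)
print(f"macro OvR AUC = {point:.3f}, 95% CI = {low:.3f}-{high:.3f}")
```

Because AUC cannot exceed 1, an interval whose upper bound is 1.000, as reported above, is expected when the point estimate is very close to 1.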

Conclusion: These findings underscore the value of integrating machine learning with molecular and biochemical markers for the accurate classification of diabetes stages, supporting a potential shift toward more personalized diabetes management.

Keywords: Diabetes mellitus; Extra tree classifier; Machine learning; RNA; T2DM.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The study was reviewed according to the guidelines of the Declaration of Helsinki and received approval from the Research Ethics Committee, Faculty of Medicine, Ain Shams University, Egypt, FWA000017585/FAMSU P28/2022. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Blueprint of the study design
Fig. 2. Summary of the machine learning workflow
Fig. 3. RNA panel differential expression across the four studied groups
Fig. 4. Correlation heatmap of T2DM dataset features
Fig. 5. Feature selection performance using RFECV. (A) Molecular, (B) Biochemical, (C) Combined set
Fig. 6. Feature importance for the combined feature group
Fig. 7. Confusion matrix of the top classifier's predictions for each feature group. (A) Molecular, (B) Biochemical, (C) Combined
Fig. 8. ROC curve for the top-performing classifier for each feature set. (A) Molecular, (B) Biochemical, (C) Combined
