BMC Med. 2023 Sep 7;21(1):342.
doi: 10.1186/s12916-023-03045-9.

Detection of diabetic patients in people with normal fasting glucose using machine learning


Kun Lv et al. BMC Med.

Abstract

Background: Diabetes mellitus (DM) is a chronic metabolic disease that can cause severe, life-threatening complications, so early detection is important for timely prevention and treatment. Fasting blood glucose (FBG) measured at physical examination is the usual tool for large-scale DM screening; however, some people with normal fasting glucose (NFG) already have diabetes and are missed by this examination. This study aimed to investigate whether common physical examination indexes for diabetes can be used to identify individuals with diabetes among populations with NFG.

Methods: Physical examination data from over 60,000 individuals with NFG in three Chinese cohorts were used. Diabetes was defined as HbA1c ≥ 48 mmol/mol (6.5%). We constructed models using multiple machine learning methods, including logistic regression, random forest, deep neural network, and support vector machine, and selected the best-performing one on the validation set. A framework based on the permutation feature importance algorithm was devised to identify personalized risk factors.
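For readers unfamiliar with permutation feature importance, the following is a minimal sketch of the generic algorithm: shuffle one feature column at a time and record how much a fixed model's score drops. It is not the authors' personalized framework. The toy data, the `toy_model` threshold rule, and the choice of accuracy as the score are all assumptions made for illustration.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: columns are [BMI, noise]; the label depends on BMI only.
X = [[22.0, 5.0], [31.0, 1.0], [24.0, 9.0], [35.0, 3.0], [21.0, 7.0], [29.0, 2.0]]
y = [0, 1, 0, 1, 0, 1]

def toy_model(row):
    # Stand-in classifier (an assumption, not the DRING model): flags BMI >= 28.
    return int(row[0] >= 28.0)

importances = permutation_importance(toy_model, X, y)
print(importances)
```

In this toy setting the second column never influences the prediction, so shuffling it leaves accuracy unchanged and its importance is zero, while shuffling the BMI column degrades accuracy and yields a positive importance.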

Results: The logistic regression model achieved the best performance, with an AUC, sensitivity, and specificity of 0.899, 85.0%, and 81.1% on the validation set and 0.872, 77.9%, and 81.0% on the test set, respectively. Following feature selection, the final classifier, which requires only 13 features and is named DRING (diabetes risk of individuals with normal fasting glucose), performed reliably on two newly recruited independent datasets, with AUCs of 0.964 and 0.899, balanced accuracies of 84.2% and 81.1%, sensitivities of 100% and 76.2%, and specificities of 68.3% and 86.0%, respectively. Feature importance ranking revealed that BMI, age, sex, absolute lymphocyte count, and mean corpuscular volume are important factors for the risk stratification of diabetes. In a case study, the framework for identifying personalized risk factors revealed FBG, age, and BMI as significant factors that increase the risk of diabetes. The DRING webserver is available for ease of application (http://www.cuilab.cn/dring).
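The balanced accuracies reported for the two independent datasets can be checked from the stated sensitivities and specificities, assuming the standard definition of balanced accuracy as their arithmetic mean (the abstract does not spell out the formula). The short sketch below does that arithmetic; the function and variable names are illustrative only.

```python
def balanced_accuracy(sensitivity, specificity):
    """Standard definition: mean of sensitivity and specificity (in percent here)."""
    return (sensitivity + specificity) / 2

# (sensitivity %, specificity %, balanced accuracy % as stated in the abstract)
reported = [(100.0, 68.3, 84.2), (76.2, 86.0, 81.1)]

checks = []
for sens, spec, stated in reported:
    computed = balanced_accuracy(sens, spec)
    # Allow for rounding of the stated value to one decimal place.
    checks.append(abs(computed - stated) < 0.06)
    print(f"sens={sens}, spec={spec}: computed {computed:.2f}, stated {stated}")
```

Both stated values agree with the mean of sensitivity and specificity to within rounding (84.15 against 84.2, and 81.10 against 81.1).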

Conclusions: DRING performed well in identifying individuals with diabetes among populations with NFG, which could aid early diagnosis and intervention for individuals who would otherwise most likely be missed.

Keywords: Diabetes risk prediction; Machine learning; Missed diagnosis; Normal fasting glucose.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Overview of the DRING approach
Fig. 2. The top 5 features exhibiting the most significant differences between diabetic and non-diabetic individuals with normal fasting glucose
Fig. 3. Performance of the predictive model on the validation set and the test set from the D1 dataset. A ROC curve. B Precision-recall curve
Fig. 4. Feature selection for the final model using mRMR and manual curation. A ROC curve of the models constructed from mRMR-selected and manual curation-selected features on the validation set and the test set of dataset D1. B Precision-recall curve of the above models. C Comparison of sensitivity, specificity, and balanced accuracy on the test set between the models constructed before and after feature selection. D ROC curve of models using selected features on the two newly recruited independent test sets. E Precision-recall curve of the above models. F Comparison of other metrics on the two newly recruited independent test sets
Fig. 5. Feature importance ranking of the models constructed from the features selected by manual curation
Fig. 6. Screening of risk factors for incident diabetes in a case study
