Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2023 Jun:235:107537.
doi: 10.1016/j.cmpb.2023.107537. Epub 2023 Apr 5.

A machine learning-based diagnosis modelling of type 2 diabetes mellitus with environmental metal exposure

Affiliations

A machine learning-based diagnosis modelling of type 2 diabetes mellitus with environmental metal exposure

Min Zhao et al. Comput Methods Programs Biomed. 2023 Jun.

Abstract

Background and objective: Increasing and compelling evidence has been proved that urinary and dietary metal exposure are underappreciated but potentially modifiable biomarkers for type 2 diabetes mellitus (T2DM). The aims of this study were (1) to identify the key potential biomarkers which contributed to T2DM with effective and parsimonious features and (2) to assess the utility of baseline variables and metal exposure in the diagnosis of T2DM.

Methods: Based on the National Health and Nutrition Examination Survey (NHANES), we selected 9822 screening records with 82 significant variables covering demographics, lifestyle, anthropometric measures, diet and metal exposure for this study. Combining extreme gradient boosting (XGBoost), random forest and light gradient boosting machine (lightGBM), a soft voting ensemble model was proposed to measure the importance of 82 features. With this soft voting ensemble model and variance inflation factor (VIF), strong multicollinear features with low importance scores were further removed from candidate biomarkers. Then, a soft voting ensemble classifier was adopted to demonstrate the efficiency of the proposed feature selection method.

Results: With the novel feature selection method, 12 baseline variables and 3 metal variables were selected to detect patients at risk for T2DM in our study. For metal variables, the dietary copper (Cu), urinary cadmium (Cd) and urinary mercury (Hg) metals were selected as the most remarkable metal exposure and the corresponding P-values were all less than 0.05. In a classification model of T2DM with 12 baseline biomarkers, the addition of 3 metal exposure improved the classification accuracy of T2DM from a traditional area under the curve (AUC) 0.792 of the receiver operating characteristic (ROC) to an AUC 0.847.

Conclusions: This was the first demonstration of T2DM classification with machine learning under urinary and dietary metal exposure. Improved prediction precision illustrated the effectiveness of the proposed machine learning-based diagnosis model facilitated lifestyle/dietary intervention for T2DM prevention.

Keywords: Environmental metal exposure; Machine learning; Statistical analysis; Type 2 diabetes mellitus.

PubMed Disclaimer

Conflict of interest statement

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Similar articles

Cited by

LinkOut - more resources