JMIR Diabetes. 2023 Apr 11:8:e44018.
doi: 10.2196/44018.

Prediction of Weight Loss to Decrease the Risk for Type 2 Diabetes Using Multidimensional Data in Filipino Americans: Secondary Analysis

Lisa Chang et al. JMIR Diabetes.

Abstract

Background: Type 2 diabetes (T2D) has an immense disease burden, affecting millions of people worldwide and costing billions of dollars in treatment. As T2D is a multifactorial disease with both genetic and nongenetic influences, accurate risk assessments for patients are difficult to perform. Machine learning has served as a useful tool in T2D risk prediction, as it can analyze and detect patterns in large and complex data sets such as those from RNA sequencing. However, before machine learning can be implemented, feature selection is a necessary step to reduce the dimensionality of high-dimensional data and optimize modeling results. Different combinations of feature selection methods and machine learning models have been used in studies reporting disease predictions and classifications with high accuracy.

Objective: The purpose of this study was to assess the use of feature selection and classification approaches that integrate different data types to predict weight loss for the prevention of T2D.

Methods: The data of 56 participants (ie, demographic and clinical factors, dietary scores, step counts, and transcriptomics) were obtained from a previously completed randomized clinical trial adaptation of the Diabetes Prevention Program study. Feature selection methods were used to select subsets of transcripts for use in the chosen classification approaches: support vector machine, logistic regression, decision trees, random forest, and extremely randomized decision trees (extra-trees). Data types were included in the different classification approaches in an additive manner to assess model performance for the prediction of weight loss.
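
One possible reading of "in an additive manner" is that each successive model sees every earlier data type plus one more. The sketch below builds such cumulative feature sets. The column names and the order of the blocks are hypothetical placeholders, not taken from the study.

```python
from itertools import accumulate

# Hypothetical column names grouped by data type. The block order follows
# the list in the Methods and is an assumption about the study design.
DATA_BLOCKS = [
    ("demographic_clinical", ["age", "waist_cm", "hip_cm"]),
    ("dietary", ["diet_score"]),
    ("steps", ["mean_daily_steps"]),
    ("transcriptomics", ["transcript_1", "transcript_2"]),
]


def additive_feature_sets(blocks):
    """Return (label, columns) pairs where each set adds one more data type."""
    labels = accumulate((name for name, _ in blocks), lambda a, b: f"{a}+{b}")
    columns = accumulate((cols for _, cols in blocks), lambda a, b: a + b)
    return list(zip(labels, columns))


feature_sets = additive_feature_sets(DATA_BLOCKS)
for label, cols in feature_sets:
    print(f"{label}: {len(cols)} features")
```

Each resulting column list could then be passed to the same classifier, so that any change in performance can be attributed to the data type added last.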

Results: Average waist and hip circumference were found to be different between those who exhibited weight loss and those who did not exhibit weight loss (P=.02 and P=.04, respectively). The incorporation of dietary and step count data did not improve modeling performance compared to classifiers that included only demographic and clinical data. Optimal subsets of transcripts identified through feature selection yielded higher prediction accuracy than when all available transcripts were included. After comparison of different feature selection methods and classifiers, DESeq2 as a feature selection method and an extra-trees classifier with and without ensemble learning provided the best results, as defined by differences in training and testing accuracy, cross-validated area under the curve, and other factors. We identified 5 genes in two or more of the feature selection subsets (ie, CDP-diacylglycerol-inositol 3-phosphatidyltransferase [CDIPT], mannose receptor C type 2 [MRC2], PAT1 homolog 2 [PATL2], regulatory factor X-associated ankyrin containing protein [RFXANK], and small ubiquitin like modifier 3 [SUMO3]).
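
The last step, finding genes that appear in two or more feature selection subsets, amounts to counting how many methods selected each gene. A minimal sketch follows. The method names, gene names, and subset memberships are placeholders and do not reproduce the study's subsets.

```python
from collections import Counter


def shared_features(subsets, min_methods=2):
    """Return features selected by at least `min_methods` methods, sorted."""
    # set() guards against a feature being listed twice by the same method.
    counts = Counter(f for selected in subsets.values() for f in set(selected))
    return sorted(f for f, n in counts.items() if n >= min_methods)


# Placeholder subsets for illustration only; not the study's results.
example_subsets = {
    "method_1": ["GENE_A", "GENE_B", "GENE_C"],
    "method_2": ["GENE_B", "GENE_D"],
    "method_3": ["GENE_A", "GENE_B", "GENE_E"],
    "method_4": ["GENE_F"],
}

shared = shared_features(example_subsets)
print(shared)  # ['GENE_A', 'GENE_B']
```

Raising `min_methods` tightens the criterion, for example to genes chosen by every method.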

Conclusions: Our results suggest that the inclusion of transcriptomic data in classification approaches for prediction has the potential to improve weight loss prediction models. Identification of which individuals are likely to respond to interventions for weight loss may help to prevent incident T2D. Out of the 5 genes identified as optimal predictors, 3 (ie, CDIPT, MRC2, and SUMO3) have been previously shown to be associated with T2D or obesity.

Trial registration: ClinicalTrials.gov NCT02278939; https://clinicaltrials.gov/ct2/show/NCT02278939.

Keywords: classification; feature selection; obesity; transcriptomics; type 2 diabetes; weight loss.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Venn diagram of overlapping and unique transcripts identified using 4 different feature selection methods. APOBEC3G: Apolipoprotein B MRNA Editing Enzyme Catalytic Subunit 3G; CBX4: Chromobox 4; CDIPT: CDP-Diacylglycerol-Inositol 3-Phosphatidyltransferase; CFS: correlation feature selection; DR1: Down-Regulator Of Transcription 1; IDH1: Isocitrate Dehydrogenase (NADP(+)) 1; MRC2: Mannose Receptor C Type 2; NFIX: Nuclear Factor I X; PATL2: PAT1 Homolog 2; RFXANK: Regulatory Factor X Associated Ankyrin Containing Protein; ST6GALNAC4: ST6 N-Acetylgalactosaminide Alpha-2,6-Sialyltransferase 4; SUMO3: Small Ubiquitin Like Modifier 3; TFG: Trafficking From ER To Golgi Regulator; TMEM86B: Transmembrane Protein 86B.
