Cell Rep Med. 2025 Jul 15;6(7):102212.
doi: 10.1016/j.xcrm.2025.102212. Epub 2025 Jul 2.

Elucidating the heterogeneity of prediabetes through subphenotyping with a two-dimensional tree structure

Hong Lin et al. Cell Rep Med.

Abstract

Prediabetes, an intermediate stage of developing diabetes, exhibits considerable phenotypic heterogeneity. Here, we apply the Discriminative Dimensionality Reduction Tree (DDRTree) algorithm to explore prediabetes heterogeneity in 55,777 participants from the China Cardiometabolic Disease and Cancer Cohort (4C) study. Based on 12 clinically available variables, we identify four distinct phenotypes and observe differential risks of type 2 diabetes mellitus (T2DM), chronic kidney disease (CKD), and cardiovascular disease (CVD). Phenotype 4, characterized by hyperglycemia, insulin resistance, obesity, elevated triglycerides, and liver enzymes, has the highest T2DM risk, while phenotype 3, predominantly driven by obesity, insulin resistance, hyperglycemia, and dyslipidemia, has the highest CKD risk. Phenotypes 3 and 4 show higher CVD risk, with distinct distributions of CVD subtypes. These findings are validated in the external cohort SN_2009-2021, and a user-friendly online tool is provided for individual risk prediction. Overall, our study elucidates the intricate dynamics of prediabetes progression, aiding in personalized management for prediabetes care.

Keywords: cardiovascular disease; chronic kidney disease; diabetes; heterogeneity; prediabetes; subphenotyping.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
A visual representation of the phenotypic characteristics of 55,777 participants with prediabetes. (A) DDRTree was used to reduce the 12 clinical variables (BMI, WHR, HOMA-IR, HOMA-B, ALT, AST, GGT, FPG, PBG, HbA1c, TG, and HDL-C, residualized for age and gender) into a non-linear tree structure (N = 55,777). The values of the clinical variables are overlaid on the tree structure to visualize the distribution of each variable over the reduced tree structure. Each point in the figure represents one individual. (B) Linear regression (N = 55,777) estimates (with 95% CI) between the DDRTree dimensions and the 12 clinical variables, showing the association between variables and dimensions. (C) Spatial autocorrelation (N = 55,777 for each spatial correlation analysis) of the 12 variables. The Moran’s I statistic is shown on the x axis, with higher values representing variables that are more strongly autocorrelated; all values were at p < 0.0001. (D) Attribution of numbers to four phenotypes based on four branches (from 1 to 4). See also Table S1 and Figure S1.
Figure 2
Visualizing the heterogeneity in prediabetes progression in participants with prediabetes. (A) Predicted probability of incident T2DM over a 5-year period (cases/N = 3,615/47,692). (B) Predicted probability of incident CKD over a 5-year period (cases/N = 1,020/37,409). For both outcomes (A and B), probabilities were generated from Cox hazard risk models constructed with DDRTree dimensions. (C) HRs (95% CIs) of DDRTree dimensions for each outcome from Cox hazard risk models. (D) Spatial autocorrelation of incident T2DM and CKD. Moran’s I statistic is shown on the x axis, with higher values representing outcomes that are more strongly autocorrelated; all values were at p < 0.0001. All predictions are from the models with DDRTree dimensions. See also Figure S1.
Figure 3
Visualizing the heterogeneity in CVD outcomes in participants with prediabetes. (A) Predicted probability of total CVD, stroke, MI, and heart failure over a 5-year period (cases/N = 933/47,818, 582/44,066, 142/46,493, and 70/46,270, respectively). For all outcomes, probabilities were generated from Cox hazard risk models constructed with DDRTree dimensions. (B) HRs (95% CIs) of DDRTree dimensions for each outcome from Cox hazard risk models. (C) Spatial autocorrelation of CVD outcomes. Moran’s I statistic is shown on the x axis, with higher values representing variables that are more strongly autocorrelated; all values were at p < 0.0001. All predictions are from the models with DDRTree dimensions. See also Figure S1.
Figure 4
A visual representation of the phenotypic characteristics of participants with prediabetes at baseline in the SN_2009–2021 study (N = 1,649). (A) The distribution of 12 clinical variables (WHR, BMI, HOMA-IR, HOMA-B, ALT, AST, GGT, FPG, PBG, HbA1c, TG, and HDL-C) over the reduced tree structure. A mapping function was used to position individuals in the SN_2009–2021 study (N = 1,649) onto the 4C (reference) tree. Each dot represents the position of one individual from the SN_2009–2021 study. Magenta points indicate higher values of the clinical variable, and yellow points indicate lower values. (B) Linear regression (N = 1,649) estimates (with 95% CI) between the DDRTree dimensions and the 12 clinical variables, showing the association between variables and dimensions. (C) Spatial autocorrelation (N = 1,649 for each spatial correlation analysis) of the 12 variables. The Moran’s I statistic is shown on the x axis, with higher values representing variables that are more strongly autocorrelated.
Figure 5
Visualizing the heterogeneity in disease progression in the SN_2009–2021 study. A mapping function was used to position individuals in the SN_2009–2021 study (N = 1,649) onto the 4C (reference) tree. Each dot represents the position of one individual from the SN_2009–2021 study. (A) Predicted probability of incident T2DM over a 10-year period (cases/N = 40/1,631). (B) Predicted probability of incident CVD over a 10-year period (cases/N = 394/1,646). (C) Predicted probability of CKD (cases/N = 14/1,262). The probability of CKD was generated from a logistic regression model constructed with DDRTree dimensions. For T2DM (A) and CVD (B), probabilities were generated from Cox proportional hazard models constructed with DDRTree dimensions. (D) ORs/HRs (95% CI) of DDRTree dimensions for each outcome. (E) Spatial autocorrelation of the three outcomes. Moran’s I statistic is shown on the x axis, with higher values representing variables that are more strongly autocorrelated; all values were at p < 0.0001. All predictions are from the models with DDRTree dimensions.
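
The figure legends report Moran's I as the measure of spatial autocorrelation over the reduced tree structure. As a rough illustration of what that statistic computes, the following is a minimal Python sketch and not the authors' implementation. The legends do not state how the spatial weights were defined, so the binary k-nearest-neighbour weighting, the function name `morans_i`, the parameter `k`, and the toy coordinates are all assumptions made for this example.

```python
import math


def morans_i(coords, values, k=2):
    """Moran's I for values placed at 2D coordinates.

    Assumption: binary k-nearest-neighbour weights (w_ij = 1 if j is one of
    the k points closest to i, else 0). The source does not specify weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("Moran's I is undefined when all values are equal")

    num = 0.0     # sum over i, j of w_ij * dev_i * dev_j
    w_total = 0   # sum of all weights
    for i in range(n):
        # The k points closest to point i, excluding i itself.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )[:k]
        for j in neighbours:
            num += dev[i] * dev[j]
            w_total += 1
    return (n / w_total) * num / denom


# Toy data (hypothetical): two tight groups of points, each group sharing one
# value, so nearby points carry similar values.
toy_coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
toy_values = [1, 1, 1, 5, 5, 5]
toy_result = morans_i(toy_coords, toy_values, k=2)
print(toy_result)
```

In this toy layout every point's two nearest neighbours share its value, so the statistic sits at its upper end. Values near zero would indicate no spatial structure, matching the legends' reading that higher values mean stronger autocorrelation.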
