Biol Direct. 2025 Jan 21;20(1):10.
doi: 10.1186/s13062-025-00601-6.

Uncovering glycolysis-driven molecular subtypes in diabetic nephropathy: a WGCNA and machine learning approach for diagnostic precision

Chenglong Fan et al. Biol Direct.

Abstract

Introduction: Diabetic nephropathy (DN) is a common diabetes-related complication with unclear underlying pathological mechanisms. Although recent studies have linked glycolysis to various pathological states, its role in DN remains largely underexplored.

Methods: In this study, the expression patterns of glycolysis-related genes (GRGs) were first analyzed using the GSE30122, GSE30528, and GSE96804 datasets, followed by an evaluation of the immune landscape in DN. An unsupervised consensus clustering of DN samples from the same dataset was conducted based on differentially expressed GRGs. The hub genes associated with DN and glycolysis-related clusters were identified via weighted gene co-expression network analysis (WGCNA) and machine learning algorithms. Finally, the expression patterns of these hub genes were validated using single-cell sequencing data and quantitative real-time polymerase chain reaction (qRT-PCR).
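As a rough illustration of the co-expression step, WGCNA conventionally turns pairwise gene correlations into a network by raising their absolute value to a soft-threshold power. The sketch below assumes the unsigned form a_ij = |cor(x_i, x_j)|^beta. The gene names, the toy expression values, and beta = 6 are hypothetical placeholders, not values taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned WGCNA-style adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps a gene name to its expression values across samples.
    beta=6 is an illustrative default, not a value reported in the source.
    """
    genes = list(expr)
    adj = {}
    for gi in genes:
        for gj in genes:
            if gi == gj:
                adj[(gi, gj)] = 1.0
            else:
                adj[(gi, gj)] = abs(pearson(expr[gi], expr[gj])) ** beta
    return adj


# Hypothetical toy data: three genes measured in four samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with geneA
    "geneC": [1.0, -1.0, 1.0, -1.0],  # weakly anti-correlated with geneA
}
adj = soft_threshold_adjacency(toy)
```

Raising the correlation to a power keeps strong co-expression links near 1 while pushing weak ones towards 0, which is why the choice of power matters for the resulting modules.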

Results: Eleven GRGs showed abnormal expression in DN samples, leading to the identification of two distinct glycolysis clusters, each with its own immune profile and functional pathways. The analysis of the GSE142153 dataset showed that these clusters had specific immune characteristics. Furthermore, the Extreme Gradient Boosting (XGB) model was the most effective in diagnosing DN. Its five most significant variables (GATM, PCBD1, F11, HRSP12, and G6PC) were identified as hub genes for further investigation. Single-cell sequencing data showed that the hub genes were predominantly expressed in proximal tubular epithelial cells. In vitro experiments confirmed the expression pattern in NC.
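Diagnostic models such as these are commonly compared by the area under the ROC curve (AUC). A minimal sketch of that metric follows, assuming binary labels (1 = DN, 0 = NC) and a model score per sample. The labels and scores are made-up toy values, and `roc_auc` is a hypothetical helper, not the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three DN samples (1) and three NC samples (0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a small cohort. A rank-based implementation would be the usual choice for larger data.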

Conclusion: Our study provides valuable insights into the molecular mechanisms underlying DN, highlighting the involvement of GRGs and immune cell infiltration.

Keywords: Diabetic nephropathy; Glycolysis; Glycolysis-related genes; Hub genes; Machine learning algorithm.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: In this study, we confirm that all experiments and methods were conducted strictly in accordance with relevant guidelines and regulations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
A flow chart of the study
Fig. 2
Expression patterns of GRGs in DN. A Box plot showing the differential expression of 32 GRGs between NC and DN samples. B Heat map of the relative expression of the 32 differentially expressed GRGs. C Chromosomal locations of the 32 differentially expressed GRGs. *P < 0.05, **P < 0.01, ***P < 0.001
Fig. 3
Cluster analysis of differentially expressed GRGs in DN samples. A At k = 2, the samples were divided into two distinct clusters. B Consensus clustering CDF for k = 2 to 9. C Tracking plot showing the cluster assignment of each sample at different k values (2–9). D Consensus clustering score as k varies from 2 to 9. E PCA illustrating the distribution of the two glycolysis clusters identified by unsupervised consensus clustering
Fig. 4
Differences in expression patterns of GRGs between the two unsupervised consensus clusters. A Box plots displaying GRGs with differential expression between the two glycolysis clusters. B Heat map showing the relative expression levels of 11 GRGs in glycolysis clusters C1 and C2. C GSVA enrichment analysis based on the HALLMARK pathways among samples of glycolysis clusters C1 and C2, sorted by t-value. *P < 0.05, **P < 0.01, ***P < 0.001
Fig. 5
Construction and module analysis of WGCNA. A Network topology analysis under different soft-threshold powers. B Clustering dendrogram illustrating the hierarchical grouping of genes by topological overlap, with module colors representing different gene clusters. C Correlation analysis of the relationship between co-expression modules and clinical features. D Correlation between brown module members and DN
Fig. 6
Residual and performance assessment of machine learning models on different feature sets. A Cumulative residual distribution: the reverse cumulative distribution of residuals for the four machine learning models (XGB, RF, SVM, and GLM). The curves show how accurately each model fits the data. B Residual box plot comparing the residual distributions of the four models. The red dots represent the root-mean-square error (RMSE) of the residuals for each model. C Feature importance for each model (GLM, RF, SVM, XGB): the top ten variables in the RMSE ranking, showing the contribution of the input features to each model. D ROC curves for the RF, SVM, XGB, and GLM models and their corresponding AUC values. E ROC curve and AUC value of the XGB model, verified using the GSE142153 dataset
Fig. 7
Characterization of cell populations and gene expression patterns in DN and NC samples using scRNA-seq data. A UMAP displaying the cellular gene expression profiles of NC and DN samples in the GSE183276 dataset. B Annotation and visualization of the cell clusters based on their expression profiles. C Histogram showing the proportion of major cell types in DN and NC samples. D Differential expression of GATM in 13 cell clusters. E Differential expression of G6PC in 13 cell clusters. F Differential expression of HRSP12 in 13 cell clusters. G Differential expression of PCBD1 in 13 cell clusters. H Differential expression of F11 in 13 cell clusters. I Spatial coordinate system illustrating the regional distribution of the different cell clusters and their corresponding AUC values. J Violin plot showing the distribution density of AUC values by cell type
Fig. 8
Validation of hub genes in in vitro hyperglycemic cell models: qRT-PCR validation of GATM, PCBD1, F11, HRSP12, and G6PC expression between DN patients and NC individuals. *P < 0.05, **P < 0.01
