Sci Rep. 2025 Apr 19;15(1):13519.
doi: 10.1038/s41598-025-97623-x.

Integrative analysis of signaling and metabolic pathways, immune infiltration patterns, and machine learning-based diagnostic model construction in major depressive disorder

Lei Tang et al. Sci Rep. 2025.

Abstract

Major depressive disorder (MDD) is a multifactorial disorder involving genetic and environmental factors, and its pathogenesis remains unclear. This study aimed to explore the pathogenic pathways of MDD and their relationship with immune responses, and to identify potential therapeutic targets using bioinformatics methods. We first applied gene set variation analysis (GSVA) and seven immune infiltration algorithms to the GSE98793 dataset to characterize differences in signaling pathways, metabolic pathways, and immune cell infiltration between MDD patients and healthy controls. Differentially expressed genes between MDD patients and controls were obtained from five datasets (GSE98793, GSE32280, GSE38206, GSE39653, and GSE52790), and 113 machine learning methods were employed to construct MDD diagnostic models. Based on the constructed diagnostic models, MDD patients were divided into high-risk and low-risk groups, and GSVA and immune microenvironment analyses were conducted to investigate the differences between the two groups. Furthermore, potential drugs and therapeutic targets for the high-risk MDD group were explored to provide new insights and directions for the precise treatment of MDD. The GSVA and immune infiltration results indicate that patients with MDD differ from healthy individuals in biological processes, signaling pathways, metabolic processes, and immune cells. To investigate the functions and biological significance of the differentially expressed genes, we performed GO and KEGG enrichment analyses on the differentially expressed genes from the five datasets. Comparing the enrichment results across the five datasets, we found that the cell-killing signaling pathway was consistently enriched in all of them, suggesting that this pathway may play a crucial role in the pathogenesis of MDD.
The random forest algorithm (AUC = 0.788) was selected as the optimal algorithm from the 113 machine learning algorithms, yielding a robust and predictive MDD diagnostic model and highlighting the important role of NPL in MDD. When MDD patients were divided into high-risk and low-risk subgroups based on diagnostic model scores, the pathway enrichment and immunological results further showed that high-risk MDD is associated with increased levels of reactive oxygen species, inflammation, and numbers of T cells and B cells. Through GSEA scoring, five upregulated pathways were identified in the high-risk MDD group, and multiple potential drugs, including mibefradil, LY364947, ZLN005, STA-5326, and vemurafenib, were screened. In summary, patients with MDD show differences in signaling pathways, metabolic pathways, and immune mechanisms. By constructing an MDD diagnostic model, we predicted the key genes of MDD and the characteristic pathways associated with a higher risk of MDD. This provides new insights for risk stratification and offers new perspectives for the clinical application of precision immunotherapy and drug development.
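
The abstract reports a model AUC and a split of patients into high-risk and low-risk groups by diagnostic model score, but does not state how the cutoff was chosen. The following is a minimal sketch of both steps, assuming a median cutoff (an assumption, not the authors' stated method). The function names `roc_auc` and `split_by_median` and the toy labels and scores are hypothetical and invented purely for illustration.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def split_by_median(scores):
    """Label each score 'high' if above the median, else 'low'.

    The median cutoff is an assumption made for this sketch.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Toy data, invented for illustration only (1 = MDD, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)

# Stratify only the patients, as the abstract describes.
patient_scores = [s for y, s in zip(labels, scores) if y == 1]
groups = split_by_median(patient_scores)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and keeps the sketch free of third-party dependencies; for real expression data a library implementation would normally be preferable.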

Keywords: Biomarker; Gene set variation analysis (GSVA); Immune infiltration; Machine learning; Major depression disorder (MDD).

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests. Ethics & inclusion statement: This study was conducted in accordance with the principles set out in the Global Code of Conduct for Research in Resource-Poor Settings. All data used in this research were obtained from publicly available databases. We ensured that our research objectives aligned with local and global health priorities. In our analysis and interpretation of the data, we were mindful of potential biases and strived to avoid generalizations that could perpetuate stereotypes. We have made efforts to ensure that our findings are accessible to the scientific community and, where applicable, to stakeholders in resource-poor settings who may benefit from this research. All authors contributed to and approved the final manuscript, and we have appropriately acknowledged the contributions of all collaborators and data sources. We are committed to the fair and equitable sharing of benefits that may arise from this research.

Figures

Fig. 1
Gene set variation analysis (GSVA) of the MDD and control groups based on the GSE98793 dataset. (A) Bar graph of GSVA scores based on “GO” enrichment pathways; (B) bar graph of GSVA scores based on “KEGG” enrichment pathways; (C) bar graph of GSVA scores based on “Reactome” pathways; (D) bar graph of GSVA scores based on “KEGG” metabolism pathways. Blue represents upregulated terms and green represents downregulated terms.
Fig. 2
Analysis of the MDD and control groups using seven immune infiltration methods, based on the GSE98793 dataset. (A) Heatmap of relative abundance differences across immune cell types; (B) violin plot of the distribution of specific immune cell subpopulations.
Fig. 3
Differentially expressed gene analysis based on five datasets (GSE98793, GSE32280, GSE38206, GSE39653, and GSE52790). (A-E) Volcano plots of differentially expressed genes between MDD patients and healthy individuals in the five datasets; genes located above the threshold line (P = 0.05) are significant, with red dots representing significantly upregulated genes and blue dots representing significantly downregulated genes; (F) heatmap of 23 hub genes, with the color scale indicating expression level (red, higher expression; blue, lower expression); (G) GO-BP enrichment analysis of hub genes; (H) KEGG pathway analysis of hub genes.
Fig. 4
Machine learning algorithm construction of MDD diagnostic model. (A) Identification of MDD feature genes using 113 machine learning algorithms; (B) Box plots of expression levels of 6 key genes; (C) Residual plot of MDD regression model; (D) Cook’s distance test plot of MDD regression model; (E) Bar plot of coefficients in MDD regression model; (F) Bar plot of MDD regression model scores for MDD patients and normal individuals; (G) ROC curves of the five datasets and the combined dataset (meta).
Fig. 5
GSVA scores for the high-risk and low-risk MDD groups. (A) Bar plot of GSVA scores based on “GO” enriched pathways; (B) bar plot of GSVA scores based on “KEGG” enriched pathways; (C) bar plot of GSVA scores based on “Reactome” pathways; (D) bar plot of GSVA scores based on “KEGG” metabolic pathways. Blue indicates upregulated items, and green indicates downregulated items.
Fig. 6
Immune infiltration analysis for high-risk and low-risk groups of MDD. (A) Heatmap showing the relative abundance differences of immune cell types; (B) Violin plot of differential immune cell levels.
Fig. 7
MDD potential drug prediction. (A) Enrichment results and top five upregulated signaling pathways in the high-score MDD group using GSEA algorithm; (B) Top ten drugs that promote and inhibit MDD patients obtained from the L100 FWD database; (C) Potential drug correlation and differential analysis for high and low-score MDD patients selected from the CTRP dataset; (D) Potential drug correlation and differential analysis for high and low-score MDD patients selected from the PRISM dataset.
