Multicenter Study
J Psychiatr Res. 2025 Jul;187:123-133.
doi: 10.1016/j.jpsychires.2025.05.015. Epub 2025 May 6.

Prediction of post stroke depression with machine learning: A national multicenter cohort study

Yumeng Gu et al. J Psychiatr Res. 2025 Jul.

Abstract

Objective: Post-stroke depression (PSD) is a common psychiatric complication following stroke, with low clinical detection rates and delayed diagnosis. Most existing PSD prediction models suffer from incomplete data inclusion, which limits their clinical predictive value. This study aims to integrate multimodal data, including clinical characteristics, biomarkers, and neuroimaging variables, to validate the potential of machine learning models in efficiently identifying high-risk PSD patients.

Methods: This study is based on a multicenter clinical follow-up cohort of patients with acute ischemic stroke (AIS) in China, conducted from December 2020 to September 2023. Predictive factors included demographic characteristics, clinical features, and previously identified neuroimaging variables associated with PSD. The primary outcome was the occurrence of PSD within 3-6 months after stroke. The dataset was divided into a training set and a test set at a 3:1 ratio, with further validation performed using an external dataset. Four machine learning models were implemented in Python: Adaptive Boosting, Gradient Boosting Decision Tree (GBDT), Quadratic Discriminant Analysis, and Multilayer Perceptron Classifier. Their predictive performance was compared using the area under the curve, accuracy, sensitivity, specificity, and F1-score.
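The workflow described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn as the Python library, uses a synthetic dataset from `make_classification` in place of the cohort, and relies on default hyperparameters. The stratified split, the feature counts, and the `random_state` values are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the cohort (assumption: not the study data).
# Roughly 35% positives; no redundant features, so QDA covariances stay full rank.
X, y = make_classification(
    n_samples=800, n_features=12, n_informative=5, n_redundant=0,
    weights=[0.65, 0.35], random_state=0,
)

# 3:1 train/test split; stratifying on the outcome is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four model families named in the Methods, with default settings.
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GBDT": GradientBoostingClassifier(random_state=0),
    "QDA": QuadraticDiscriminantAnalysis(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}

# Fit each model and score it by AUC on the held-out test set.
auc_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    prob_positive = model.predict_proba(X_test)[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, prob_positive)

# Print models from highest to lowest test AUC.
for name, auc in sorted(auc_by_model.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

The numbers printed by this sketch reflect the synthetic data only and say nothing about the study's results. External validation would repeat the scoring loop on a separately collected dataset without refitting the models.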

Results: A total of 4298 AIS patients (mean age: 68.33 ± 8.82 years, 46.4% male) were included, among whom 1483 developed PSD. In the test dataset, the GBDT model achieved an area under the curve (AUC) of 0.8626, accuracy of 0.7833, sensitivity of 0.8085, specificity of 0.5296, and an F1-score of 0.6396, outperforming the other models. In the external validation set, the GBDT model also demonstrated superior performance, with an AUC of 0.8185, accuracy of 0.8636, sensitivity of 0.8846, specificity of 0.5285, and an F1-score of 0.6689. The most important predictors of PSD included National Institutes of Health Stroke Scale (NIHSS) at discharge, left-sided lesions, lacunar infarcts (LIs), homocysteine (HCY) levels, and systolic blood pressure (SBP).
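For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and F1-score follow from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen for illustration; they are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not study data).
metrics = classification_metrics(tp=80, fp=30, tn=70, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike the AUC, all four of these metrics depend on the probability threshold used to turn model scores into predicted classes, so they should be read together with that threshold.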

Conclusion: The machine learning model performs well in predicting PSD. Clinicians should focus on stroke patients with high NIHSS scores at discharge, left-sided lesions, LIs, elevated HCY levels, and high SBP to develop personalized and precise management and treatment strategies for high-risk PSD patients, aiming to prevent or delay PSD onset.

Keywords: Interpretability; Machine learning; Post stroke depression; Predictive model.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
