Front Med (Lausanne). 2025 Mar 13:12:1529993.

doi: 10.3389/fmed.2025.1529993. eCollection 2025.

Explainable machine learning model and nomogram for predicting the efficacy of Traditional Chinese Medicine in treating Long COVID: a retrospective study

Jisheng Zhang¹, Yang Chen¹, Aijun Zhang², Yi Yang³, Liqian Ma¹, Hangqi Meng¹, Jintao Wu¹, Kean Zhu¹, Jiangsong Zhang¹, Ke Lin⁴, Xianming Lin¹

Affiliations

¹ The Third Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China.
² Jiaxing First People's Hospital, Jiaxing, China.
³ Haining People's Hospital, Jiaxing, China.
⁴ The First Clinical Medical College of Zhejiang Chinese Medicine University, Hangzhou, China.

PMID: 40182854
PMCID: PMC11966431
DOI: 10.3389/fmed.2025.1529993

Jisheng Zhang et al. Front Med (Lausanne). 2025.

Abstract

Introduction: Long COVID significantly affects patients' quality of life, yet no standardized treatment has been established. Traditional Chinese Medicine (TCM) offers a promising approach with targeted therapeutic strategies. This study aims to develop an explainable machine learning (ML) model and nomogram to identify Long COVID patients who may benefit from TCM, enhancing clinical decision-making.

Methods: We analyzed data from 1,331 Long COVID patients treated with TCM between December 2022 and February 2024 at three hospitals in Zhejiang, China. Effectiveness was defined as improvement in two or more symptoms or a minimum 2-point increase in the Traditional Chinese Medicine Syndrome Score (TCMSS). Data included 11 patient and disease characteristics, 18 clinical symptoms and syndrome scores, and 12 auxiliary examination indicators. The least absolute shrinkage and selection operator (LASSO) method identified features linked to TCM efficacy. Data from 1,204 patients served as the training set, while 127 patients formed the testing set.
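The outcome definition and the split sizes described above can be written down as a minimal sketch. The function and argument names are hypothetical, and treating "improvement in two or more symptoms" as a simple count of improved symptoms is an assumption; the abstract does not say how symptom improvement was scored.

```python
def is_effective(symptoms_improved: int, tcmss_change: float) -> bool:
    """Effectiveness rule as stated in the abstract (hypothetical names).

    symptoms_improved: number of symptoms judged improved (assumed to be a count).
    tcmss_change: change in TCMSS, using the abstract's wording
                  ("a minimum 2-point increase").
    """
    return symptoms_improved >= 2 or tcmss_change >= 2


# Sample sizes reported in the abstract.
n_total, n_train, n_test = 1331, 1204, 127
train_fraction = n_train / n_total
test_fraction = n_test / n_total

print(is_effective(symptoms_improved=1, tcmss_change=2.0))
print(is_effective(symptoms_improved=1, tcmss_change=1.0))
print(f"train {train_fraction:.1%}, test {test_fraction:.1%}")
```

With the reported counts, the split works out to roughly 90% training and 10% testing.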

Results: We employed five ML algorithms: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Neural Network (NN). The XGBoost model achieved an Area Under the Curve (AUC) of 0.9957 and an F1 score of 0.9852 in the training set, and showed superior performance in the testing set, with an AUC of 0.9059 and an F1 score of 0.9027. Key features identified through SHapley Additive exPlanations (SHAP) included chest tightness, aversion to cold, age, TCMSS, Short Form (36) Health Survey (SF-36), C-reactive protein (CRP), and lymphocyte ratio. The logistic regression-based nomogram demonstrated an AUC of 0.9479 and an F1 score of 0.9384 in the testing set.
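For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUC and F1 are defined, using only the Python standard library. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not data from the study, and the abstract does not state which software or threshold the authors used.

```python
def auc_score(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 1 = treatment effective, 0 = not effective.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.35, 0.4, 0.2, 0.7, 0.6, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(auc_score(y_true, y_score))
print(f1_score(y_true, y_pred))
```

AUC depends only on how the scores rank patients, while F1 depends on the chosen threshold, so the two can move independently.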

Conclusion: This study utilized multicenter data and multiple ML algorithms to create an ML model for predicting TCM efficacy in Long COVID treatment. Furthermore, a logistic regression-based nomogram was developed to assist the model and improve decision-making efficiency in TCM applications for Long COVID management.

Keywords: Long COVID; SHapley Additive exPlanations; Traditional Chinese Medicine; efficacy; machine learning; nomogram.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Flowchart of this study. TCM, Traditional Chinese Medicine; LASSO, Least Absolute Shrinkage and Selection Operator; SVM, Support Vector Machine; RF, Random Forest; KNN, K-Nearest Neighbors; XGBoost, Extreme Gradient Boosting; NN, Neural Network; AUC, Area Under the Curve; SHAP, SHapley Additive exPlanations; ML, machine learning.

**Figure 2**
The results of the ROC curve, AUC value, and correlation analysis. **(A)** Receiver Operating Characteristic (ROC) curve plot; **(B)** Area Under the Curve (AUC) bar chart; **(C)** Correlation matrix heatmap. BMI, body mass index; TCMSS, Traditional Chinese Medicine syndrome score; SF-36, Short Form (36) Health Survey; PSQI, Pittsburgh Sleep Quality Index; CRP, C-reactive protein; WBC, White blood cells; RBC, Red blood cells; AST/ALT, the ratio of Aspartate Aminotransferase to Alanine Aminotransferase. *p < 0.05, **p < 0.01, ***p < 0.001.

**Figure 3**
The results of LASSO regression. **(A)** LASSO Coefficient Path Plot; **(B)** Cross-Validation Error Plot for LASSO.

**Figure 4**
Combined ROC Curves for Multiple Machine Learning Models. **(A)** ROC curves for training sets; **(B)** ROC curves for testing sets. KNN, K-Nearest Neighbors; NN, Neural Network; RF, Random Forest; SVM, Support Vector Machine; XGBoost, Extreme Gradient Boosting.

**Figure 5**
SHAP value of each feature in the model. **(A)** SHAP feature importance shown according to the mean absolute SHAP value of each feature; **(B)** SHAP summary plot showing the distribution of the SHAP values of each feature. SF-36, Short Form (36) Health Survey; TCMSS, Traditional Chinese Medicine syndrome score; CRP, C-reactive protein.
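The feature importance in panel (A) is described as the mean absolute SHAP value per feature. A minimal sketch of that aggregation follows, assuming a matrix of per-patient SHAP values is already available. The function name and the numbers are hypothetical illustrations, not values from the study.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean |SHAP| across patients (rows = patients)."""
    n = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    importance = {name: total / n for name, total in zip(feature_names, totals)}
    # Largest mean absolute contribution first.
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values for three patients and three features.
features = ["SF-36", "TCMSS", "CRP"]
shap_rows = [
    [0.4, -0.1, 0.05],
    [-0.6, 0.2, -0.05],
    [0.5, -0.3, 0.2],
]

ranking = mean_abs_shap(shap_rows, features)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.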

**Figure 6**
SHAP dependence plots of continuous features in the model. **(A)** Short Form (36) Health Survey (SF-36), **(B)** Traditional Chinese Medicine syndrome score (TCMSS), **(C)** C-reactive protein (CRP), **(D)** age, and **(E)** lymphocyte ratio. The y-axis shows the SHAP value of each feature, and the x-axis shows the feature's value. Continuous variables were standardized using min–max scaling, resulting in values between 0 and 1. Each dot represents the SHAP value of a feature for one patient, and color from light to dark represents the feature's value from high to low. A SHAP value above zero for a feature represents an increased probability of Traditional Chinese Medicine being effective in treating Long COVID. SF-36, Short Form (36) Health Survey; TCMSS, Traditional Chinese Medicine syndrome score; CRP, C-reactive protein.
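The min–max scaling mentioned in this caption maps each continuous variable onto the interval from 0 to 1. A minimal sketch, with a hypothetical function name and made-up ages rather than study data:

```python
def min_max_scale(values):
    """Rescale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to 0.0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [25, 40, 55, 70]  # hypothetical ages in years
scaled = min_max_scale(ages)
print(scaled)
```

The smallest value maps to 0 and the largest to 1, which is why the x-axes of the dependence plots run from 0 to 1 regardless of the original units.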

**Figure 7**
Patient-level SHAP force plots. **(A)** True positive patient, **(B)** True negative patient. The color represents the contributions of each feature, with red being positive and blue being negative. The length of the color bar represents the contribution strength.

**Figure 8**
Nomogram for logistic regression. SF-36, Short Form (36) Health Survey; TCMSS, Traditional Chinese Medicine syndrome score; CRP, C-reactive protein.
