J Transl Med. 2024 Dec 24;22(1):1144.
doi: 10.1186/s12967-024-05982-2.

Predicting higher risk factors for COVID-19 short-term reinfection in patients with rheumatic diseases: a modeling study based on XGBoost algorithm

Yao Liang et al. J Transl Med.

Abstract

Background: Coronavirus disease 2019 (COVID-19) reinfection, particularly short-term reinfection, poses challenges to the management of rheumatic diseases and may increase adverse clinical outcomes. This study aims to develop machine learning models to predict the risk of short-term COVID-19 reinfection in patients with rheumatic diseases and to identify its risk factors.

Methods: We developed four prediction models using explainable machine learning to assess the risk of short-term COVID-19 reinfection in 543 patients with rheumatic diseases. Psychological health was evaluated using the Functional Assessment of Chronic Illness Therapy Fatigue (FACIT-F) scale, the Patient Health Questionnaire-9 (PHQ-9), the Generalized Anxiety Disorder 7-item (GAD-7) questionnaire, and the Pittsburgh Sleep Quality Index (PSQI). Health status and disease activity were assessed using the EuroQol-5 Dimension-3 Level (EQ-5D-3L) descriptive system and the Visual Analogue Scale (VAS). Model performance was assessed by the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), and the geometric mean of sensitivity and specificity (G-mean). SHapley Additive exPlanations (SHAP) analysis was used to interpret the contribution of each predictor to the model outcomes.
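Of the three performance metrics named above, the G-mean is the least commonly reported, so a minimal sketch of how it can be computed may help. The function name `g_mean` and the labels below are made up for illustration; they are not the study's code or data.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical labels: 4 reinfected (1) and 6 not reinfected (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# Sensitivity = 3/4 and specificity = 4/6, so the G-mean is sqrt(0.5).
print(round(g_mean(y_true, y_pred), 3))  # 0.707
```

Because it multiplies the two rates, the G-mean drops toward zero when either class is poorly recognized, which makes it a useful complement to AUC when one outcome class is much rarer than the other.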

Results: The eXtreme Gradient Boosting (XGBoost) model demonstrated superior performance with an AUC of 0.91 (95% CI 0.87-0.95). Factors significantly associated with short-term reinfection included glucocorticoid taper (OR = 2.61, 95% CI 1.38-4.92), conventional synthetic disease-modifying antirheumatic drugs (csDMARDs) taper (OR = 2.97, 95% CI 1.90-4.64), the number of symptoms (OR = 1.24, 95% CI 1.08-1.42), and GAD-7 scores (OR = 1.07, 95% CI 1.02-1.13). Higher FACIT-F scores were associated with a lower likelihood of short-term reinfection (OR = 0.95, 95% CI 0.93-0.96). In addition, the GAD-7 score was among the most important predictors.
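The odds ratios for the questionnaire scores look small because they are presumably expressed per one-point change. The sketch below rescales them to larger score differences, assuming (the abstract does not state this) that they come from a logistic model in which each score enters linearly on the log-odds scale. The helper `scale_odds_ratio` and the chosen score differences are illustrative only.

```python
def scale_odds_ratio(or_per_unit, delta):
    """Odds ratio for a `delta`-unit change, given the odds ratio for a 1-unit change.

    Assumes a linear effect on the log-odds scale, so the log odds ratios add
    and the odds ratios multiply: OR(delta) = OR(1) ** delta.
    """
    return or_per_unit ** delta

# Reported per-unit odds ratios from the abstract.
gad7_or = 1.07     # GAD-7 score
facit_f_or = 0.95  # FACIT-F score

# Hypothetical score differences chosen for illustration.
print(round(scale_odds_ratio(gad7_or, 5), 2))      # 1.4
print(round(scale_odds_ratio(facit_f_or, 10), 2))  # 0.6
```

Under that assumption, a 5-point difference in GAD-7 would correspond to roughly 40% higher odds, and a 10-point difference in FACIT-F to roughly 40% lower odds. These are arithmetic rescalings of the reported estimates, not additional results from the study.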

Conclusion: We developed explainable machine learning models to predict the risk of short-term COVID-19 reinfection in patients with rheumatic diseases. SHAP analysis highlighted the importance of clinical and psychological factors. Factors including anxiety, fatigue, depression, poor sleep quality, high disease activity during initial infection, and glucocorticoid tapering were significant predictors. These findings underscore the need for targeted preventive measures in this patient population.

Keywords: COVID-19; Machine learning; Psychological factors; Rheumatism; SHAP analysis; Short-term reinfection; XGBoost.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The participants provided written informed consent to participate in this study. In accordance with the Declaration of Helsinki, ethical approval for the current study was obtained from the Institutional Review Board (IRB) of the Third Affiliated Hospital of Sun Yat-sen University (Number: II2023-090–02). Consent for publication: Not applicable. Competing interests: The authors have no competing interests to disclose.

Figures

Fig. 1
Flowchart of the overall study design and modeling analysis steps
Fig. 2
SHAP importance and model performance plots. A Mean absolute SHAP values for each predictive variable. B Distribution of SHAP values for the predictive variables. The x-axis shows the direction of the importance score (negative or positive). Colors range from low values (dark blue) to high values (yellow). C ROC curves, showing model performance on the classification task. D Precision-recall curves
Fig. 3
Survival curves and GAM trajectory plots. A and B show the probability curves for short-term reinfection for the total dataset and the sex subgroups, respectively. C-H show the estimated probability of reinfection as a function of FACIT-F, PHQ-9, GAD-7, PSQI, patient self-reported outcome (previous infection), and patient self-reported outcome (current), respectively. The light blue band is the 95% CI around each blue trajectory line
