Front Public Health. 2025 Apr 16:13:1495794.
doi: 10.3389/fpubh.2025.1495794. eCollection 2025.

Constructing a screening model to identify patients at high risk of hospital-acquired influenza on admission to hospital

Shangshu Zhang et al. Front Public Health. 2025.

Abstract

Objective: To develop a machine learning (ML)-based admission screening model for hospital-acquired (HA) influenza using routinely available data to support early clinical intervention.

Methods: The study included patients hospitalized from January 2021 to May 2024. The case group consisted of patients with HA influenza, while the control group comprised non-HA influenza patients admitted to the same ward in the HA influenza unit within 2 weeks. The 953 subjects were divided into a training set and a validation set in a 7:3 ratio. Feature screening was performed using least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm. Eight ML algorithms were then compared using 5-fold cross-validation to identify the optimal model. The area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AP), F1 score, calibration curve, and decision curve analysis (DCA) were used to assess the predictive performance of the selected models. Feature importance was assessed using SHapley Additive exPlanations (SHAP). An interactive web-based platform was also developed to visualize and demonstrate the predictive model.
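As a rough illustration of three of the metrics named in the Methods (AUC, AP, and F1 score), the following is a minimal sketch in plain Python. The labels and scores are invented toy values, not study data, and the function names, the 0.5 classification threshold, and the no-tied-scores assumption in the AP calculation are all assumptions made for this example rather than details reported by the authors.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AP as the mean precision at the rank of each positive case.

    Assumes no tied scores and at least one positive (toy simplification).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score after converting scores to labels at a fixed threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


# Toy example (hypothetical): 1 = HA influenza case, 0 = control.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
f1 = f1_at_threshold(y_true, scores)
print(f"AUC={auc:.4f}  AP={ap:.4f}  F1={f1:.4f}")
```

AUC and AP depend only on how the model ranks patients, whereas F1 depends on the chosen threshold, which is one reason a model can show a high AUC alongside a more modest F1 score.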

Results: Age, pneumonia on admission, chronic renal failure, malignant tumor, hypoproteinemia, glucocorticoid use, admission to ICU, lymphopenia, and BMI were identified as key variables. Across the eight ML algorithms, AUC values in the validation set ranged from 0.548 to 0.812. Overall, the XGBoost model performed best, with the highest AUC (0.812), an F1 score of 0.590, and the highest AP value (0.655). For the optimal model, the AUC values were 0.995, 0.826, and 0.781 for the training, validation, and test sets, respectively. The XGBoost model showed strong robustness. SHAP was used to analyze the contribution of explanatory variables to the model and their correlation with HA influenza. In addition, we developed a practical online prediction tool to calculate the risk of HA influenza occurrence.
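To give a sense of what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a tiny hypothetical risk function by averaging each feature's marginal contribution over all feature orderings. This is not the authors' XGBoost model or the `shap` library: the function `toy_risk`, its weights, and the three feature names are invented for illustration, and treating a feature as simply "absent" is a simplification of how SHAP handles missing features.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(value_fn: Callable[[FrozenSet[str]], float],
                   features: Sequence[str]) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_risk(present: FrozenSet[str]) -> float:
    """Hypothetical risk score; all weights are made up for this example."""
    risk = 0.10  # baseline risk with no risk factors present
    if "older_age" in present:
        risk += 0.20
    if "icu_admission" in present:
        risk += 0.30
    if "lymphopenia" in present:
        risk += 0.10
    # Invented interaction: ICU admission together with lymphopenia adds extra risk.
    if "icu_admission" in present and "lymphopenia" in present:
        risk += 0.10
    return risk


features = ["older_age", "icu_admission", "lymphopenia"]
phi = shapley_values(toy_risk, features)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the risk with all features present and the baseline risk, and the interaction term is split equally between the two features involved. Enumerating every ordering is only practical for a handful of features.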

Conclusion: Using routinely available data, the XGBoost model demonstrated excellent calibration among the ML algorithms compared and accurately predicted the risk of HA influenza, and it may therefore serve as an effective tool for early screening of HA influenza.

Keywords: SHAP (SHapley’s additive explanation); hospital-acquired influenza; machine learning; practical tool; prediction model.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Screening of feature variables using LASSO regression analysis and the Boruta algorithm. (a,b) Factor screening based on the LASSO regression model, with the left dashed line indicating the lambda value that gives the best evaluation metric (lambda.min) and the right dashed line indicating the lambda value at which the evaluation metric is within one standard error of the best value (lambda.1se); (c) Boruta algorithm variable screening trajectories; (d) The common subset of Boruta and LASSO.
Figure 2
Construction and comparison of multiple ML algorithm models. (a) ROC curve analysis in the training set. (b) ROC curve analysis in the validation set. (c) Calibration curves of the ML models in the validation set. (d) PR curves of the ML models in the validation set.
Figure 3
Construction and evaluation of the XGBoost model. (a–c) ROC curves for the training set (a), validation set (b), and test set (c); (d) XGBoost classifier learning curve; (e) Calibration curve of the model; (f) DCA diagram of the model; (g) Confusion matrix for the training set; (h) Confusion matrix for the test set.
Figure 4
SHAP analysis of the XGBoost model. (a) SHAP dendrogram of features. (b) Importance ranking plot of features. (c,d) Interpretability analysis of 2 independent samples.
Figure 5
Online prediction model for HA influenza and individual patient risk presentation.
