BMC Med Inform Decis Mak. 2025 May 14;25(1):186.
doi: 10.1186/s12911-025-03020-9.

Development and application of an early prediction model for risk of bloodstream infection based on real-world study

Xiefei Hu et al. BMC Med Inform Decis Mak. 2025.

Abstract

Background: Bloodstream infection (BSI) is a severe systemic infectious disease that can lead to sepsis and multiple organ dysfunction syndrome (MODS), resulting in high mortality and a major global public health burden. Early identification of BSI is crucial for effective intervention, reducing mortality, and improving patient outcomes. However, existing diagnostic methods are limited by low specificity, long detection times, and high demands on testing platforms. Advances in artificial intelligence offer a new approach to early disease identification. This study aims to identify the optimal combination of routine laboratory data and clinical monitoring indicators, and to use machine learning algorithms to construct an early, rapid, and widely applicable BSI risk prediction model that assists the early diagnosis of BSI in clinical practice.

Methods: Clinical data from 2582 patients with suspected BSI admitted to the Chongqing University Central Hospital from January 1, 2021 to December 31, 2023 were collected for this study. The data were divided chronologically into a modeling dataset and an external validation dataset, and the modeling dataset was further divided into a training set and an internal validation set. The occurrence rate of BSI, the distribution of pathogens, and the microbial primary reporting time were analyzed within the training set. Feature selection combined univariate regression with machine learning (ML) algorithms. First, univariate logistic regression was used to screen for predictive factors of BSI. Then, the Boruta algorithm, Lasso regression, and Recursive Feature Elimination with Cross-validation (RFE-CV) were employed to determine the optimal combination of predictors. Based on this combination, six machine learning algorithms were used to construct early BSI risk prediction models. The best model was selected on the basis of predictive performance, and the Shapley Additive Explanations (SHAP) method was used to explain it. The external validation set was used to evaluate the predictive performance and generalizability of the selected model, and the research findings were ultimately applied in clinical practice.
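The chronological split and the consensus step across the three feature selectors can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the cutoff date, the record layout, the function names, and the toy feature names are all assumptions made for the example, and the abstract does not state where the split was placed.

```python
from datetime import date


def chronological_split(records, cutoff):
    """Split records into (modeling, external) sets by admission date.

    Records admitted before `cutoff` form the modeling set; the rest form
    the external validation set. The cutoff is a hypothetical value.
    """
    modeling = [r for r in records if r["admitted"] < cutoff]
    external = [r for r in records if r["admitted"] >= cutoff]
    return modeling, external


def consensus_features(*selections):
    """Return the features chosen by every selector, sorted by name."""
    common = set(selections[0])
    for chosen in selections[1:]:
        common &= set(chosen)
    return sorted(common)


# Toy records: IDs and dates are invented for illustration.
records = [
    {"id": 1, "admitted": date(2021, 3, 5)},
    {"id": 2, "admitted": date(2022, 7, 19)},
    {"id": 3, "admitted": date(2023, 2, 1)},
    {"id": 4, "admitted": date(2023, 11, 30)},
]
modeling, external = chronological_split(records, cutoff=date(2023, 1, 1))

# Hypothetical outputs of three selectors (not the study's actual lists).
boruta = ["wbc", "il6", "temperature", "lactate"]
lasso = ["wbc", "il6", "temperature", "age"]
rfe_cv = ["il6", "wbc", "temperature"]
shared = consensus_features(boruta, lasso, rfe_cv)
```

Splitting by date rather than at random means the external set contains only later admissions, which is a stricter check of how a model behaves on future patients.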

Results: The incidence of BSI among inpatients at the Chongqing University Central Hospital was 12.91%. Feature selection yielded a set of 5 variables: white blood cell count, standard bicarbonate, base excess of extracellular fluid, interleukin-6, and body temperature. Early BSI risk prediction models were constructed using six machine learning algorithms. The XGBoost model performed best, with an AUC of 0.782 in the internal validation set and 0.776 in the external validation set. The model is publicly available as an online web tool for clinical use.
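For readers less familiar with the AUC figures reported above, the metric can be read as the probability that a randomly chosen BSI case receives a higher predicted risk than a randomly chosen non-BSI case. The sketch below computes it directly from that definition. The function name, labels, and scores are invented for illustration and are not the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly.

    `labels` holds 0/1 outcomes and `scores` the predicted risks.
    Tied pairs count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 positive and 3 negative cases with made-up risk scores.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
auc = roc_auc(labels, scores)
```

In the toy data, 7 of the 9 positive-negative pairs are ranked correctly, so the function returns 7/9. The pairwise loop is quadratic in the number of cases; it is meant to show the definition, and a rank-based routine is the usual choice for large datasets.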

Conclusions: By analyzing routine laboratory data and clinical monitoring indicators among hospitalized patients, this study identified a set of 5 features. Based on this set, a machine learning-based early risk prediction model for BSI was constructed. The model is capable of early and rapid differentiation between BSI and non-BSI patients. Its small number of predictors enhances its applicability in clinical settings, particularly at the primary care level. The online application of the model makes it more convenient for clinical use and could improve the efficiency of BSI diagnosis and reduce patient mortality.

Keywords: Bloodstream infection; Model construction; Real-world; Risk prediction.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The studies involving humans were approved by the Ethics Committee of Chongqing Emergency Medical Center and Chongqing University Central Hospital (Approval Ethics Review No. RS202410). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. The research protocol was approved by the institutional review board and adhered to the ethical guidelines of the Helsinki Declaration. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Flow chart depicting the number of patients included in the analysis after applying the exclusion criteria. The total included encounters were divided into those with and without BSI
Fig. 2
The ROC curves of predictive factors identified by univariate logistic regression analysis
Fig. 3
Selection of key features for BSI. (a) Variable selection plot of Boruta; (b) variable selection plot of Lasso; (c) variable selection plot of RFE-CV; (d) Venn diagram displaying the 5 features shared by Boruta, Lasso, and RFE-CV
Fig. 4
ROC curves of six models in the internal validation set
Fig. 5
Performance evaluation of the XGBoost model. (a) ROC curve of external validation set in the XGBoost model; (b) calibration curve of XGBoost model
Fig. 6
Model interpretation of XGBoost. (a) Importance ranking of features; (b) example of a low-risk patient; (c) example of a high-risk patient

