Front Cell Infect Microbiol. 2025 Jul 28:15:1605485.

doi: 10.3389/fcimb.2025.1605485. eCollection 2025.

Prediction of bacteremia using routine hematological and metabolic parameters based on logistic regression and random forest models

Ting-Qiang Wang¹, Ying Zhuo¹, Chun-E Lv¹, Jing Shi², Ling-Hui Yao¹, Shi-Yan Zhang¹, Jinbao Shi³

Affiliations

¹ Department of Clinical Laboratory, Fuding Hospital, Fujian University of Traditional Chinese Medicine, Fuding, Fujian, China.
² Department of Anesthesiology, Fuding Hospital, Fujian University of Traditional Chinese Medicine, Fuding, Fujian, China.
³ Department of Nephrology, Ningde Hospital of Traditional Chinese Medicine, Ningde, Fujian, China.

PMID: 40792110
PMCID: PMC12336153
DOI: 10.3389/fcimb.2025.1605485

Ting-Qiang Wang et al. Front Cell Infect Microbiol. 2025.

Abstract

Background: This study aimed to evaluate the predictive utility of routine hematological, inflammatory, and metabolic markers for bacteremia and to compare the classification performance of logistic regression and random forest models.

Methods: A retrospective study was conducted on 287 inpatients who underwent blood culture testing at Fuding Hospital, Fujian University of Traditional Chinese Medicine, between March and August 2024. Patients were divided into bacteremia (n = 137) and non-bacteremia (n = 150) groups based on blood culture results. Hematological indices, inflammatory markers (e.g., C-reactive protein (CRP), procalcitonin (PCT)), metabolic indices (e.g., glucose, cholesterol), and nutritional markers (e.g., albumin) were analyzed. Univariate and multivariate binary logistic regression analyses were used to identify independent risk factors. Logistic regression and random forest models were developed using 33 features with a 70:30 train-test split and evaluated using receiver operating characteristic (ROC) curves, confusion matrices, and standard classification metrics.
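The modeling workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn and synthetic data; it is not the authors' code. The stratified split, the feature scaling, the random seed, and all hyperparameters are assumptions that the abstract does not specify, and the feature matrix is random noise with one weak injected signal, so its scores say nothing about the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in with the study's dimensions (NOT the study data):
# 287 patients, 33 features, 137 bacteremia and 150 non-bacteremia labels.
n_patients, n_features = 287, 33
y = np.array([1] * 137 + [0] * 150)
X = rng.normal(size=(n_patients, n_features))
X[:, 0] += 0.8 * y  # weak artificial signal so the models have something to learn

# 70:30 split; stratification is an assumption, not stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Hyperparameters below are illustrative defaults, not the authors' settings.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # predicted probability of bacteremia
    pred = model.predict(X_test)               # class labels at the default 0.5 cutoff
    results[name] = {
        "auc": roc_auc_score(y_test, proba),
        "recall": recall_score(y_test, pred),
        "confusion_matrix": confusion_matrix(y_test, pred),
    }

for name, res in results.items():
    print(name, f"AUC={res['auc']:.2f}", f"recall={res['recall']:.2f}")
```

With 287 rows and `test_size=0.30`, scikit-learn rounds the test set up to 87 patients and leaves 200 for training.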

Results: Hemoglobin, cholesterol, and albumin levels were significantly lower in the bacteremia group, while platelet count, CRP, PCT, glucose, and triglycerides were significantly elevated (all p < 0.05). Logistic regression identified platelet count (odds ratio (OR) = 1.003, 95% confidence interval (CI): 1.001-1.006), PCT (OR = 1.032, 95% CI: 1.004-1.060), triglycerides (OR = 1.740, 95% CI: 1.052-2.879), and low cholesterol (OR = 0.523, 95% CI: 0.383-0.714) as independent risk factors. The area under the ROC curve (AUC) was 0.75 for the random forest model and 0.74 for logistic regression, with recall rates of 0.69 and 0.60, respectively.
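Because each odds ratio above is expressed per one-unit change in its predictor, values close to 1 (such as platelet count) can still correspond to sizeable effects over a clinically realistic range. The sketch below, in plain Python, converts the reported ORs back to log-odds coefficients and rescales them. The helper names are hypothetical, the standard error is back-calculated from the rounded CI under the usual Wald assumption, and the 100-unit increment is purely illustrative because the abstract does not state measurement units.

```python
import math

# (OR, 95% CI lower, 95% CI upper) as reported in the abstract.
reported = {
    "platelet_count": (1.003, 1.001, 1.006),
    "procalcitonin": (1.032, 1.004, 1.060),
    "triglycerides": (1.740, 1.052, 2.879),
    "cholesterol": (0.523, 0.383, 0.714),
}


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (beta, approx. SE) assuming a Wald CI: exp(beta +/- z * SE)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def rescale_or(odds_ratio, delta):
    """Odds ratio for a change of `delta` units instead of one unit."""
    return odds_ratio ** delta


for name, (or_, lo, hi) in reported.items():
    beta, se = log_odds_summary(or_, lo, hi)
    print(f"{name}: beta={beta:+.4f}, approx SE={se:.4f}")

# Illustrative only: OR for a 100-unit rise in platelet count (units assumed).
print(f"platelet OR per 100 units: {rescale_or(1.003, 100):.2f}")
```

Under these assumptions, a per-unit OR of 1.003 for platelet count corresponds to roughly 1.35 per 100 units, while the cholesterol coefficient is negative, consistent with lower cholesterol being associated with higher odds of bacteremia.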

Conclusion: Routine laboratory markers integrated into machine learning models demonstrated potential for early bacteremia prediction. Random forest exhibited superior sensitivity compared to logistic regression, suggesting its potential utility as a clinical screening tool.

Keywords: bacteremia; biomarkers; blood culture; logistic regression; machine learning; random forest.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Confusion matrices of logistic regression and random forest models. This figure illustrates the confusion matrices for the logistic regression model (left) and the random forest model (right) on the test dataset. Compared with logistic regression, the random forest model achieved a slightly higher number of true positives (TP = 29 vs. 25) and fewer false negatives (FN = 13 vs. 17), indicating improved sensitivity. However, the random forest model also showed a modest increase in false positives (FP = 11 vs. 10), suggesting a slight reduction in specificity as a trade-off for higher sensitivity.
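The counts in the Figure 1 caption are enough to recompute the recall values quoted in the abstract. The short sketch below does that arithmetic in plain Python; the function and variable names are illustrative. True-negative counts are not given in the caption, so specificity is deliberately left out rather than guessed.

```python
# Test-set counts taken from the Figure 1 caption.
counts = {
    "logistic_regression": {"tp": 25, "fn": 17, "fp": 10},
    "random_forest": {"tp": 29, "fn": 13, "fp": 11},
}


def recall(tp, fn):
    """Sensitivity: share of true bacteremia cases that the model flags."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Positive predictive value: share of flagged cases that are true cases."""
    return tp / (tp + fp)


def f1_score(tp, fn, fp):
    """Harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fn + fp)


metrics = {
    name: {
        "recall": recall(c["tp"], c["fn"]),
        "precision": precision(c["tp"], c["fp"]),
        "f1": f1_score(c["tp"], c["fn"], c["fp"]),
    }
    for name, c in counts.items()
}

for name, m in metrics.items():
    print(
        f"{name}: recall={m['recall']:.2f}, "
        f"precision={m['precision']:.2f}, f1={m['f1']:.2f}"
    )
```

Both models are scored on the same 42 culture-positive patients (TP + FN), and the recomputed recalls of about 0.60 and 0.69 match the values reported in the Results.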

**Figure 2**
Comparison of ROC curves between logistic regression and random forest models. The ROC curves of the two models exhibit similar shapes, indicating that logistic regression and random forest achieved comparable classification performance on this dataset.
