Heliyon. 2024 Apr 9;10(8):e29372.
doi: 10.1016/j.heliyon.2024.e29372. eCollection 2024 Apr 30.

Differentiating viral and bacterial infections: A machine learning model based on routine blood test values

Gregor Gunčar et al. Heliyon. 2024.

Abstract

The growing threat of antibiotic resistance necessitates accurate differentiation between bacterial and viral infections for proper antibiotic administration. In this study, a "Virus vs. Bacteria" machine learning model was developed to distinguish between these infection types using 16 routine blood test results, C-reactive protein (CRP) concentration, biological sex, and age. With a dataset of 44,120 cases from a single medical center, the model achieved an accuracy of 82.2%, a sensitivity of 79.7%, a specificity of 84.5%, a Brier score of 0.129, and an area under the ROC curve (AUC) of 0.905, outperforming a CRP-based decision rule. Notably, the machine learning model improved accuracy within the CRP range of 10–40 mg/L, a range where CRP alone is less informative. These results highlight the advantage of integrating multiple blood parameters in diagnostics. The "Virus vs. Bacteria" model paves the way for advanced diagnostic tools that leverage machine learning to optimize infection management.
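As an illustration of the metrics reported in the abstract, the following minimal Python sketch computes accuracy, sensitivity, specificity, the Brier score, and the AUC from predicted probabilities. It is not the authors' evaluation code. The function names, the label encoding (1 = bacterial, 0 = viral), and the toy data are assumptions made for this example; the 0.5 threshold matches the default operating point mentioned in the caption of Fig. 7.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a probability threshold.

    Assumption: label 1 is the positive class (here, bacterial), and both
    classes are present in y_true.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; these are not values from the study.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]

metrics = confusion_metrics(y_true, y_prob)
print(metrics)
print(brier_score(y_true, y_prob))
print(roc_auc(y_true, y_prob))
```

A lower Brier score indicates better-calibrated probabilities, while the AUC summarizes ranking quality across all thresholds, which is why the two are reported alongside the threshold-dependent accuracy, sensitivity, and specificity.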

Conflict of interest statement

Marko Notar is the CEO of Smart Blood Analytics SA. Mateja Notar, Sašo Moškon, Tim Smole, Žiga Osterc, Marjeta Tušek Jelenc and Manca Köster hold positions at Smart Blood Analytics Swiss SA. Matjaž Kukar, Peter Černelč, and Gregor Gunčar serve as advisors to Smart Blood Analytics Swiss SA. The remaining authors declare no competing interests.

Figures

Fig. 1
Flow diagram of cases included in the study. Unlabeled cases were used only for training the diagnostic model. All reported results were obtained using the labeled cases only.
Fig. 2
Semi-supervised bootstrap labeling of unlabeled cases.
Fig. 3
Violin plots of blood parameters for visual comparisons of ‘Bacteria’ and ‘Virus’ populations. Most parameters exhibit considerable perceptible differences between the populations. For this visualization, all cases from the training set were included. The red vertical dashed lines represent reference intervals. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 4
Visualization of the parameter space with the UMAP method. Each dot represents a single blood test or, more specifically, an embedding of all blood parameters into a two-dimensional space, and its color represents the infection type. Blue dots represent blood tests with viral infections, and orange dots represent blood tests with bacterial infections. The visualization shows all the labeled cases from the training set. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 5
Performance of the Virus vs. Bacteria model (model) and simple CRP decision rule (CRP) (A) on all cases from 10-fold cross-validation, (B) on all cases from 10-fold cross-validation that have CRP values within the region of interest between 10 and 40 mg/L.
Fig. 6
Importance of the top six parameters as a function of different CRP concentration ranges. The vertical dashed lines denote the CRP range of 10–40 mg/L.
Fig. 7
ROC curves of the Virus vs. Bacteria model (model) and simple CRP decision rule (CRP) (A) on all evaluation cases, (B) on all evaluation cases that have CRP values within the region of interest between 10 and 40 mg/L. For the model, a default operating point (0.5) is visualized; for the CRP decision rule, an operating point of 24 mg/L is displayed.
Fig. 8
Distribution of Shapley values across the entire dataset, which can be used to identify the features that have the greatest impact on the model's predictions. Each dot represents a case in the dataset, and its position along the x-axis corresponds to the Shapley value for a particular blood parameter. The color of the dot indicates the value of the feature for that instance, with blue indicating a low value and red indicating a high value. The position of the dot indicates its impact on the prediction: dots to the left of the zero vertical line indicate a negative impact, while dots to the right indicate a positive impact. The larger the absolute value of the Shapley value, the larger the impact of the corresponding feature on the prediction. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
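
The simple CRP decision rule compared against the model in Figs. 5 and 7 can be sketched as a single-threshold classifier. The sketch below is illustrative only: the 24 mg/L operating point and the 10–40 mg/L region of interest come from the figure captions, while the function names, the direction of the rule (CRP at or above the cutoff is labeled bacterial), and the inclusive boundaries are assumptions made for this example.

```python
def crp_rule(crp_mg_per_l: float, cutoff: float = 24.0) -> str:
    """Label a case from CRP alone.

    Assumption: values at or above the cutoff are labeled 'Bacteria',
    values below it 'Virus'.
    """
    return "Bacteria" if crp_mg_per_l >= cutoff else "Virus"


def in_region_of_interest(crp_mg_per_l: float,
                          low: float = 10.0, high: float = 40.0) -> bool:
    """True if CRP lies in the 10-40 mg/L range (inclusive bounds assumed)."""
    return low <= crp_mg_per_l <= high


# Hypothetical CRP values in mg/L, for illustration only.
for crp in (5.0, 18.0, 30.0, 80.0):
    print(crp, crp_rule(crp), in_region_of_interest(crp))
```

A rule of this kind uses one parameter and one cutoff, so within the 10–40 mg/L range it has little information to separate the two populations; this is the range in which the abstract reports the multi-parameter model to be most helpful.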
