Health Informatics J. 2018 Mar;24(1):24-42.
doi: 10.1177/1460458216656471. Epub 2016 Aug 4.

Detecting hospital-acquired infections: A document classification approach using support vector machines and gradient tree boosting

Claudia Ehrentraut et al. Health Informatics J. 2018 Mar.

Abstract

Hospital-acquired infections pose a significant risk to patient health, while their surveillance is an additional workload for hospital staff. Our overall aim is to build a surveillance system that reliably detects all patient records that potentially include hospital-acquired infections. This is to reduce the burden of having the hospital staff manually check patient records. This study focuses on the application of text classification using support vector machines and gradient tree boosting to the problem. Support vector machines and gradient tree boosting have never been applied to the problem of detecting hospital-acquired infections in Swedish patient records, and according to our experiments, they lead to encouraging results. The best result is yielded by gradient tree boosting, at 93.7 percent recall, 79.7 percent precision and 85.7 percent F1 score when using stemming. We can show that simple preprocessing techniques and parameter tuning can lead to high recall (which we aim for in screening patient records) with appropriate precision for this task.
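The recall, precision and F1 figures quoted above follow the standard definitions for binary classification. As a minimal sketch, the snippet below computes them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: records correctly flagged as containing an infection
    fp: records flagged although they contain no infection
    fn: records with an infection that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only (not taken from the paper).
precision, recall, f1 = precision_recall_f1(tp=45, fp=15, fn=5)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In a screening setting such as the one described, a missed record (a false negative) is the costlier error, which is why recall is the measure the authors prioritize.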

Keywords: clinical decision-making; databases and data mining; ehealth; electronic health records; secondary care.

Conflict of interest statement

Declaration of conflicting interests: The author(s) received no financial support for the research, authorship, and/or publication of this article.

Figures

Figure 1. A high-level flow chart describing this study’s text-classification approach for automatically detecting hospital-acquired infections (HAI). DR stands for daily patient record. In this study, a patient’s DR comprises data from four modules. All DRs of a patient together amount to the patient’s HR.
Figure 2. Top 20 feature importances for optimized GTB TF1000+stemming trained on the whole dataset. English translations in parentheses. Note that because stemming is used, the English translation is an approximation, as directly translating a stem is not always possible.
Figure 3. Top 20 feature importances for unoptimized GTB TF1000+stemming trained on the whole dataset. English translations in parentheses.
Figure 4. Top 20 feature importances for unoptimized GTB using TF1000 without stemming. English translations in parentheses.
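The figure captions refer to a "TF1000" feature set. As a rough sketch, and assuming this means term-frequency counts restricted to the 1000 most frequent terms (the abstract does not define it), such features could be built as follows. The function names, the whitespace tokenizer and the three toy Swedish snippets are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter
from typing import List


def build_vocabulary(docs: List[str], max_features: int = 1000) -> List[str]:
    """Return the max_features most frequent terms across all documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_features]]


def vectorize(doc: str, vocabulary: List[str]) -> List[int]:
    """Map a document to its term-frequency vector over the vocabulary."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocabulary]


# Made-up toy records for illustration only.
docs = [
    "feber och infektion",
    "ingen feber",
    "infektion efter operation infektion",
]
vocab = build_vocabulary(docs, max_features=3)
vectors = [vectorize(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

A stemming step, as used in the best-performing configuration, would be applied to each token before counting; it is left out here to keep the sketch dependency-free.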
