Infect Control Hosp Epidemiol. 2018 Apr;39(4):425-433.
doi: 10.1017/ice.2018.16.

A Generalizable, Data-Driven Approach to Predict Daily Risk of Clostridium difficile Infection at Two Large Academic Health Centers

Jeeheh Oh et al. Infect Control Hosp Epidemiol. 2018 Apr.

Abstract

OBJECTIVE An estimated 293,300 healthcare-associated cases of Clostridium difficile infection (CDI) occur annually in the United States. To date, research has focused on developing risk prediction models for CDI that work well across institutions. However, this one-size-fits-all approach ignores important hospital-specific factors. We focus on a generalizable method for building facility-specific models. We demonstrate the applicability of the approach using electronic health records (EHR) from the University of Michigan Hospitals (UM) and the Massachusetts General Hospital (MGH).

METHODS We utilized EHR data from 191,014 adult admissions to UM and 65,718 adult admissions to MGH. We extracted patient demographics, admission details, patient history, and daily hospitalization details, resulting in 4,836 features from patients at UM and 1,837 from patients at MGH. We used L2 regularized logistic regression to learn the models, and we measured the discriminative performance of the models on held-out data from each hospital.

RESULTS Using the UM and MGH test data, the models achieved area under the receiver operating characteristic curve (AUROC) values of 0.82 (95% confidence interval [CI], 0.80-0.84) and 0.75 (95% CI, 0.73-0.78), respectively. Some predictive factors were shared between the 2 models, but many of the top predictive factors differed between facilities.

CONCLUSION A data-driven approach to building models for estimating daily patient risk for CDI was used to build institution-specific models at 2 large hospitals with different patient populations and EHR systems. In contrast to traditional approaches that focus on developing models that apply across hospitals, our generalizable approach yields risk-stratification models tailored to an institution. These hospital-specific models allow for earlier and more accurate identification of high-risk patients and better targeting of infection prevention strategies.

Infect Control Hosp Epidemiol 2018;39:425-433.
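The L2 regularized logistic regression named in the methods can be illustrated with a minimal sketch. This is not the authors' implementation: the optimizer (batch gradient descent), the hyperparameters `lam`, `lr`, and `epochs`, the helper names, and the two-feature synthetic data are all assumptions made for illustration, and the abstract does not say how the regularization strength was chosen.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=500):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent.

    The bias term b is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient of the penalty (lam / 2) * ||w||^2 is lam * w.
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_risk(w, b, x):
    # Predicted probability of the positive class for one feature vector.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic toy data (NOT from the study): feature 0 is informative, feature 1 is noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = 1 if rng.random() < 0.3 else 0
    X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
    y.append(label)

w_weak, b_weak = fit_l2_logreg(X, y, lam=0.01)     # light regularization
w_strong, b_strong = fit_l2_logreg(X, y, lam=1.0)  # heavy regularization
```

A larger `lam` shrinks the weights toward zero, which is the usual motivation for an L2 penalty when the number of features is large, as it is here with several thousand features per institution.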

Conflict of interest statement

Potential conflicts of interest: All authors report no conflicts of interest relevant to this article.

Figures

FIGURE 1.
Inclusion and exclusion criteria and demographics of study populations. The inclusion and exclusion criteria for the study population at each institution are shown, along with the demographics of the final study populations. The period for inclusion, length of stay requirements, and kind of visit differed slightly between study populations. The same exclusion criteria were applied to both study populations with regard to history of CDI within 14 days prior to admission and positive CDI within 2 calendar days of admission. The final study populations comprised 191,014 and 65,718 adult inpatient encounters at UM and MGH, respectively.
FIGURE 2.
Discriminative performance of the institution-specific classifiers on their respective held-out test sets. The receiver operating characteristic (ROC) curves illustrate the tradeoff in performance between the false-positive rate (1-specificity) and the true-positive rate (sensitivity). Both classifiers achieve good discriminative performance as measured by the area under the ROC curve (AUROC): 0.82 at UM and 0.75 at MGH.
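As an illustration of how an ROC curve and its area relate, the sketch below computes AUROC two ways: as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and as the trapezoidal area under the (false-positive rate, true-positive rate) points. The function names and the eight toy scores are assumptions for illustration and are not study data.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def roc_points(scores, labels):
    """(FPR, TPR) pairs from sweeping the decision threshold over each distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_area(points):
    # Area under a piecewise-linear curve given as (x, y) points sorted by x.
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores and labels (hypothetical).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]

rank_auc = auroc(toy_scores, toy_labels)
curve_auc = trapezoid_area(roc_points(toy_scores, toy_labels))
```

Both quantities agree on this toy input, which is why AUROC can be read as a ranking measure that does not depend on any single decision threshold.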
FIGURE 3.
Measuring model calibration. Predictions are grouped into quintiles by predicted risk and plotted against the observed CDI incidence rate within each quintile. Points that fall closer to the “y = x” line are better calibrated. Classifiers for both institutions appear to be well calibrated. This is also evident in their low Brier scores: both classifiers have scores of 0.01. Brier scores measure the accuracy of probabilistic predictions and range from 0 to 1, where 0 represents perfect predictions. The calibration plot for UM is shown on the left, with MGH on the right.
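The two calibration measures in this caption can be sketched as follows. The quantile binning rule (sort by predicted risk, then split into near-equal groups) is an assumption, since the caption does not state how ties or uneven group sizes were handled, and all inputs are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def calibration_by_quantile(probs, outcomes, n_bins=5):
    """Return (mean predicted risk, observed event rate) for each quantile group."""
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    groups = []
    for k in range(n_bins):
        chunk = pairs[k * n // n_bins:(k + 1) * n // n_bins]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            observed = sum(o for _, o in chunk) / len(chunk)
            groups.append((mean_pred, observed))
    return groups

# Hypothetical predictions for 10 patients, giving 2 patients per quintile.
toy_probs = [i / 10 for i in range(10)]
toy_outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
quintiles = calibration_by_quantile(toy_probs, toy_outcomes)

# Reference point: a constant prediction equal to a 1% event rate
# (1 positive among 100 hypothetical patients).
constant_brier = brier_score([0.01] * 100, [1] + [0] * 99)
```

In the last example the constant predictor scores 0.0099, so for rare outcomes it can help to read a Brier score alongside the base rate and the calibration plot rather than on its own.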
FIGURE 4.
Confusion matrices of the institution-specific classifiers on their respective held-out test sets. Selecting a decision threshold based on the 95th percentile results in classifiers that achieve very good specificity (95.2%) at both institutions and relatively good positive predictive values of 5.6% and 4.4% at UM and MGH, respectively. For perspective, the baseline positive predictive values (ie, the fraction of positive cases) at each institution are 1.00% and 0.74%, respectively. Thus, both classifiers were approximately 6 times better than the baseline.
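The thresholding step and the comparison against baseline can be sketched as below. The sketch assumes the 95th percentile is taken over one score per patient using a nearest-rank rule, which the caption does not specify; the 100 toy patients are hypothetical. The final two lines only redo the caption's arithmetic from its reported percentages.

```python
import math

def percentile_threshold(scores, pct=95):
    """Nearest-rank percentile: smallest score with at least pct% of scores at or below it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def confusion_at_threshold(scores, labels, threshold):
    """Count TP/FP/TN/FN when every score strictly above the threshold is flagged."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s, l in zip(scores, labels):
        flagged = s > threshold
        if flagged and l == 1:
            counts["tp"] += 1
        elif flagged:
            counts["fp"] += 1
        elif l == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def specificity(c):
    return c["tn"] / (c["tn"] + c["fp"])

def ppv(c):
    return c["tp"] / (c["tp"] + c["fp"])

# 100 hypothetical patients with scores 0.01 .. 1.00 and two positive cases.
toy_scores = [i / 100 for i in range(1, 101)]
toy_labels = [0] * 100
toy_labels[98] = 1  # patient with score 0.99 (will be flagged)
toy_labels[49] = 1  # patient with score 0.50 (will be missed)

threshold = percentile_threshold(toy_scores, 95)
matrix = confusion_at_threshold(toy_scores, toy_labels, threshold)
baseline = sum(toy_labels) / len(toy_labels)  # fraction of positive cases
toy_lift = ppv(matrix) / baseline

# Ratios implied by the percentages reported in the caption.
lift_um = 5.6 / 1.00   # about 5.6
lift_mgh = 4.4 / 0.74  # about 5.9
```

Flagging only the top 5% of scores keeps specificity high by construction, so the positive predictive value relative to the baseline rate is the more informative number at this threshold.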
FIGURE 5.
Measuring how far in advance the model correctly identifies cases. Using a threshold based on the 95th percentile, we measure the time from when each positive patient first crosses that threshold to when they are clinically diagnosed with CDI. At both institutions, among the patients who are correctly identified as positive (ie, the true positives), the model identifies half at least 5 days in advance (the black dashed line represents the median). The plot for UM is shown on the left, with MGH on the right.
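The lead-time measurement described here can be sketched as follows. The record layout (`daily_risk`, `diagnosis_day`), the 1-based day numbering, and the choice to consider only days before the diagnosis day are assumptions of this sketch, and the five patients are hypothetical.

```python
from statistics import median

def first_crossing_day(daily_risk, threshold):
    """Return the 1-based hospital day on which risk first exceeds the threshold, or None."""
    for day, risk in enumerate(daily_risk, start=1):
        if risk > threshold:
            return day
    return None

def lead_times(patients, threshold):
    """Days between first threshold crossing and diagnosis, for true positives only."""
    leads = []
    for p in patients:
        if p["diagnosis_day"] is None:
            continue  # never diagnosed, so not a positive case
        # Only risk estimates from days before the diagnosis day are considered.
        prior_risk = p["daily_risk"][: p["diagnosis_day"] - 1]
        crossed = first_crossing_day(prior_risk, threshold)
        if crossed is not None:
            leads.append(p["diagnosis_day"] - crossed)
    return leads

# Hypothetical patients; the risk threshold of 0.5 is arbitrary.
toy_patients = [
    {"daily_risk": [0.1, 0.6, 0.7, 0.7, 0.8, 0.8, 0.9], "diagnosis_day": 8},
    {"daily_risk": [0.1, 0.2, 0.3, 0.9], "diagnosis_day": 5},
    {"daily_risk": [0.1, 0.1, 0.2], "diagnosis_day": 4},     # never crosses: missed case
    {"daily_risk": [0.7, 0.7, 0.8, 0.9, 0.9], "diagnosis_day": 6},
    {"daily_risk": [0.9, 0.9], "diagnosis_day": None},        # flagged but never diagnosed
]

toy_leads = lead_times(toy_patients, threshold=0.5)
median_lead = median(toy_leads)
```

Patients who never cross the threshold before diagnosis are excluded, which matches the caption's restriction to true positives; the median therefore describes only the cases the model catches.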
