PLoS One. 2021 Sep 20;16(9):e0257520.
doi: 10.1371/journal.pone.0257520. eCollection 2021.

Replicating prediction algorithms for hospitalization and corticosteroid use in patients with inflammatory bowel disease

Ryan W Gan et al. PLoS One. 2021.

Abstract

Introduction: Previous work had shown that machine learning models can predict inflammatory bowel disease (IBD)-related hospitalizations and outpatient corticosteroid use based on patient demographic and laboratory data in a cohort of United States Veterans. This study aimed to replicate this modeling framework in a nationally representative cohort.

Methods: A retrospective cohort design using Optum Electronic Health Records (EHR) was used to identify IBD patients with at least 12 months of follow-up between 2007 and 2018. IBD flare was defined as an inpatient/emergency visit with a diagnosis of IBD or an outpatient corticosteroid prescription for IBD. Predictors included demographic and laboratory data. Logistic regression and random forest (RF) models were used to predict IBD flare within 6 months of each visit. A 70% training and 30% validation approach was used.
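
As an illustration of the outcome definition above, the following minimal sketch labels each visit by whether a flare follows within the prediction window. It is not the authors' code: the function name `label_visits`, the 183-day approximation of "6 months", and the choice to ignore a flare recorded on the visit date itself are all assumptions made for this example.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Assumption: "within 6 months" is approximated as 183 days;
# the abstract does not state the exact window length.
HORIZON = timedelta(days=183)


def label_visits(visit_dates, flare_dates, horizon=HORIZON):
    """Return 1 for each visit followed by a flare within `horizon`, else 0.

    Assumption: a flare on the visit date itself is not counted.
    """
    flares = sorted(flare_dates)
    labels = []
    for visit in visit_dates:
        # Index of the first flare strictly after the visit date.
        i = bisect_right(flares, visit)
        has_flare = i < len(flares) and flares[i] - visit <= horizon
        labels.append(1 if has_flare else 0)
    return labels


# Hypothetical visit and flare dates for a single patient.
visits = [date(2015, 1, 10), date(2015, 9, 1), date(2016, 6, 15)]
flares = [date(2015, 5, 2), date(2017, 3, 1)]
labels = label_visits(visits, flares)
print(labels)
```

Only the first visit is labeled positive here, because the May 2015 flare falls inside its window, while the March 2017 flare is more than 183 days after both later visits.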

Results: A total of 95,878 patients across 780,559 visits were identified. Of these, 22,245 (23.2%) patients had at least one IBD flare. Patients were predominantly White (87.7%) and female (57.1%), with a mean age of 48.0 years. The logistic regression model had an area under the receiver operating curve (AuROC) of 0.66 (95% CI: 0.65-0.66), sensitivity of 0.69 (95% CI: 0.68-0.70), and specificity of 0.74 (95% CI: 0.73-0.74) in the validation cohort. The RF model had an AuROC of 0.80 (95% CI: 0.80-0.81), sensitivity of 0.74 (95% CI: 0.73-0.74), and specificity of 0.72 (95% CI: 0.72-0.72) in the validation cohort. Important predictors of IBD flare in the RF model were the number of previous flares, age, potassium, and white blood cell count.
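
For readers less familiar with the reported metrics, the sketch below shows one way sensitivity, specificity, and AuROC can be computed from true labels and predicted risk scores. The function names, the 0.5 threshold, and the toy data are assumptions for illustration only; the abstract does not state the threshold or software the authors used. The last line checks the reported flare proportion from the counts given above.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, y_score):
    """AuROC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = flare) and predicted risks, not study data.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7]

sens, spec = sensitivity_specificity(y_true, y_score)
auc = auroc(y_true, y_score)
print(round(sens, 3), round(spec, 3), round(auc, 3))

# Reported proportion of patients with at least one flare: 22,245 of 95,878.
flare_pct = round(100 * 22245 / 95878, 1)
print(flare_pct)
```

The pairwise definition of AuROC is quadratic in the number of visits, so it suits small examples; at the scale of 780,559 visits a rank-based implementation would be the practical choice.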

Conclusion: The machine learning modeling framework was replicated, and the results showed similar predictive accuracy in a nationally representative cohort of IBD patients. This modeling framework could be embedded in routine practice as a tool to identify patients at high risk of disease activity.

Conflict of interest statement

Ryan Gan and Diana Sun are full-time employees of Genentech, Inc., a member of the Roche group, and own shares of Roche stock. Amanda Tatro is a full-time employee of F. Hoffmann La Roche AG and owns shares of Roche stock. This does not alter our adherence to PLOS ONE policies on sharing data and materials. No other authors have competing interests.

Figures

Fig 1. Flow chart of identification of patients with inflammatory bowel disease (IBD) from the Optum electronic health records (EHR) database.
Fig 2. A) AuROC and B) DCA for prediction of flare in the next 6 months for the logistic regression model using demographic data, the logistic regression model using demographic and laboratory data, and the random forest model using demographic and laboratory data. AUC = area under the curve; AuROC = area under the receiver operating curve; DCA = decision curve analysis; RF = random forest model; ROC = receiver operating curve.
Fig 3. TreeSHAP summary plot of the top 10 most important variables for predicting a flare in the next 6 months for patients with IBD. IBD = inflammatory bowel disease; Max = maximum.
