CMAJ. 2025 Mar 24;197(11):E286-E297.
doi: 10.1503/cmaj.241117.

Machine learning prediction of premature death from multimorbidity among people with inflammatory bowel disease: a population-based retrospective cohort study

Gemma Postill et al. CMAJ.

Abstract

Background: Multimorbidity, the co-occurrence of 2 or more chronic conditions, is important in patients with inflammatory bowel disease (IBD) given its association with complex care plans, poor health outcomes, and excess mortality. Our objectives were to describe premature death (age < 75 yr) among people with IBD and to identify patterns between multimorbidity and premature death among decedents with IBD.

Methods: Using the administrative health data of people with IBD who died between 2010 and 2020 in Ontario, Canada, we conducted a population-based, retrospective cohort study. We described the proportion of premature deaths among people with IBD. We developed statistical and machine learning models to predict premature death from the presence of 17 chronic conditions and the patients' age at diagnosis. We evaluated models using accuracy, positive predictive value, sensitivity, F1 scores, area under the receiver operating characteristic curve (AUC), calibration plots, and explainability plots.
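The threshold-based metrics named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classification_metrics` and the toy labels are hypothetical, and 1 is assumed to denote premature death.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, PPV, sensitivity and F1 from binary labels (1 = premature death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0          # positive predictive value
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0  # harmonic mean of PPV and sensitivity
    return {"accuracy": accuracy, "ppv": ppv, "sensitivity": sensitivity, "f1": f1}


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

F1 is the harmonic mean of positive predictive value and sensitivity, so it is high only when both are high.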

Results: All models showed strong performance (AUC 0.81-0.95). The best performing was the model that incorporated age at diagnosis for each chronic condition developed at or before age 60 years (AUC 0.95, 95% confidence interval 0.94-0.96). Salient features for predicting premature death were young ages of diagnosis for mood disorder, osteo- and other arthritis types, other mental health disorders, and hypertension, as well as male sex.
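As an illustration of how an AUC and its 95% confidence interval might be obtained, the sketch below computes the AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, and then a percentile bootstrap interval. The abstract does not state how the interval was derived, so the bootstrap is an assumption, and the function names and toy scores are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data, not taken from the study.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
point_estimate = auc(y_true, scores)
ci_low, ci_high = bootstrap_auc_ci(y_true, scores)
print(point_estimate, ci_low, ci_high)
```

With only four observations the interval is very wide; a narrow interval such as the reported 0.94-0.96 requires a large cohort.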

Interpretation: By comparing results from multiple approaches modelling the impact of chronic conditions on premature death among people with IBD, we showed that conditions developed early in life (age ≤ 60 yr) and their age of onset were important for predicting their health trajectory. Clinically, our findings emphasize the need for models of care that ensure people with IBD have access to high-quality, multidisciplinary health care.

Conflict of interest statement

Competing interests: Eric Benchimol reports funding from the Canadian Association of Gastroenterology, the Canadian Institutes of Health Research (CIHR), Crohn’s and Colitis Canada, and Helmsley Charitable Trust; travel support from Samsung Bioepis; and stipends from Catrile & Associates, Cleveland Clinic, University of Calgary, HMP Global, the College of Physicians and Surgeons of Ontario, European Horizon Grant Program, and the Asian Pan-Pacific Society for Pediatric Gastroenterology, Hepatology and Nutrition (APPSPGHAN). He has acted as a consultant for McKesson Canada and the Dairy Farmers of Ontario for matters unrelated to medications used to treat inflammatory bowel disease. He has also acted as a consultant for the Canadian Drug Agency. He reports membership and board roles with Crohn’s and Colitis Canada, Crohn’s and Colitis Young Adult Network, CIHR, Empowering Next-Generation Researchers in Perinatal and Child Health, Health Canada, and the Canadian Children Inflammatory Bowel Disease Network. No other competing interests were declared.

Figures

Figure 1:
Overview of prediction pipeline and models used across prediction tasks. We specified 3 modelling tasks for predicting premature death among people with inflammatory bowel disease (IBD). We then used 3 types of models, namely logistic regression, random forest, and Extreme Gradient Boosting (XGBoost); XGBoost was the only model used for task 3 as it enabled direct modelling of missing data (those without conditions would have missing data). See Related Content for accessible version. Note: CCS = chronic coronary syndrome, CHF = congestive heart failure, COPD = chronic obstructive pulmonary disease, ED = emergency department, HTN = hypertension, MI = myocardial infarction, RA = rheumatoid arthritis.
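The three feature sets implied by the modelling tasks can be sketched as below. This is a hedged illustration only: the condition names, the `build_features` helper, and the choice to treat diagnoses made after age 60 as missing in task 3 are assumptions, not details confirmed by the source. Missing ages are encoded as NaN, which is the kind of value a gradient-boosted tree model can handle directly.

```python
import math

AGE_CUTOFF = 60  # conditions developed at or before this age count as "early"


def build_features(diagnosis_ages, conditions):
    """Build per-patient features for the three tasks from a {condition: age} mapping."""
    task1, task2, task3 = {}, {}, {}
    for cond in conditions:
        age = diagnosis_ages.get(cond)  # None when the patient never had the condition
        present = age is not None
        early = present and age <= AGE_CUTOFF
        task1[cond] = int(present)  # task 1: presence or absence at any age
        task2[cond] = int(early)    # task 2: presence or absence at or before the cutoff
        # Task 3: age at diagnosis, with NaN as missing (assumed to include late diagnoses).
        task3[cond] = float(age) if early else math.nan
    return task1, task2, task3


# Hypothetical patient, not taken from the study.
conditions = ["hypertension", "mood_disorder", "copd"]
patient = {"hypertension": 45, "copd": 68}
t1, t2, t3 = build_features(patient, conditions)
print(t1, t2, t3)
```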
Figure 2:
Feature importance for predicting premature death from (A) presence or absence of all chronic conditions (Extreme Gradient Boosting 1 [XGB1] model), (B) presence or absence of chronic conditions developed at or before age 60 years (XGB2 model), and (C) age at diagnosis of chronic conditions developed at or before age 60 years (XGB3 model), as denoted by the Shapley Additive Explanations (SHAP) analysis. Each dot represents a single patient in the data set. The X axis shows SHAP values, which quantify the impact of each predictor variable’s value on the model’s output. Positive SHAP values indicate a higher likelihood of premature death, while negative values indicate a lower likelihood. Colours represent the variable values for each condition across each patient in the data set. In subplots A and B, red values indicate the condition was present, while blue values indicate that the condition was absent. In subplot C, red corresponds to an older age of condition diagnosis (up to age 60 yr) and blue corresponds to younger age. In all panels, sex is distinguished such that red indicates males and blue indicates females. Grey values represent patients with missing values (e.g., no diagnosis of condition). The correlation between blue dots and positive SHAP values indicates that younger ages of diagnosis were primarily used by the model to predict premature death. The vertical dispersion of points for each feature reflects variability in its impact across patients. This visualization helps identify key predictors and their relative influence on model decisions. See Related Content for accessible version. Note: COPD = chronic obstructive pulmonary disease.
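To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by brute force for a tiny, hypothetical risk score. It is not how the SHAP analysis in the figure was run (tree models are normally explained with specialised algorithms); the feature names, contribution sizes, and the `toy_risk` function are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values: weighted average marginal contribution over all coalitions."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


# Hypothetical additive contributions to a risk score, plus one interaction term.
CONTRIB = {"mood_disorder": 0.3, "hypertension": 0.2, "male_sex": 0.1}


def toy_risk(present):
    """Toy model output given the set of features that are 'switched on'."""
    score = sum(CONTRIB[f] for f in present)
    if "mood_disorder" in present and "hypertension" in present:
        score += 0.1  # interaction, shared equally between the two features
    return score


phi = shapley_values(toy_risk, list(CONTRIB))
print(phi)
```

The values sum to the difference between the model output with all features present and with none, which is the additivity property that lets a plot like this one decompose each patient's prediction feature by feature.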
