Derivation and Validation of Predictive Models for Early Pediatric Sepsis
- PMID: 41082207
- PMCID: PMC12519407
- DOI: 10.1001/jamapediatrics.2025.3892
Abstract
Importance: Sepsis is a leading cause of death in children. Early recognition and treatment improve outcomes, but predictive models have not to date improved early diagnosis.
Objective: To develop machine learning models to estimate the probability of developing sepsis in the subsequent 48 hours.
Design, setting, and participants: This was a multisite registry study for model derivation and validation using electronic health record (EHR) data from January 2016 through February 2020, with temporal validation from January 2021 through December 2022. Two machine learning algorithms, logistic regression (specifically ridge regression) and gradient tree boosting, were compared for predicting the development of sepsis and septic shock. Five health systems contributing to the Pediatric Emergency Care Applied Research Network were included. Eligible visits were emergency department (ED) visits by children aged 2 months to less than 18 years, excluding patients with an ED disposition of death or transfer, a trauma diagnosis, or sepsis present during the predictive features window. The TRIPOD-AI reporting guideline was followed, and data analysis was conducted from September 2023 to July 2025.
Exposures: Patient and physiologic characteristics within the first 4 hours of ED care.
Main outcomes and measures: Sepsis, defined as suspected infection with a Phoenix Sepsis Criteria (PSC) score of 2 or more or death within 48 hours of ED arrival.
Results: The dataset included 1 604 422 eligible visits in the training cohort and 719 298 visits in the test cohort. Performance characteristics for the PSC sepsis prediction models were AUROC of 0.92 (95% CI, 0.92-0.93) for logistic regression and 0.94 (95% CI, 0.93-0.94) for gradient tree boosting. AUROCs for PSC shock models were 0.92 or greater. The gradient tree boosting models had positive likelihood ratios ranging from 4.67 (95% CI, 4.61-4.74) to 6.18 (95% CI, 6.08-6.28) for sepsis and from 4.16 (95% CI, 4.07-4.24) to 5.83 (95% CI, 5.67-5.99) for septic shock. Predictive features included emergency severity index, age-adjusted vital signs, and medical complexity. In the fairness assessment, model performance was similar across all demographic characteristics except payer; AUROC for patients with Medicaid insurance was higher than for those with commercial payers.
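For readers less familiar with the two metrics reported in the Results, the following is a minimal Python sketch of how a positive likelihood ratio and an AUROC are conventionally computed. It is not the study's code. The function names are hypothetical, and the 1% pretest probability in the usage lines is an illustrative assumption, not a prevalence reported in the abstract.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity), the standard definition."""
    if not 0.0 <= sensitivity <= 1.0 or not 0.0 <= specificity < 1.0:
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0.0 <= pretest_probability < 1.0:
        raise ValueError("pretest_probability must be in [0, 1)")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half).
    This pairwise form is O(n^2) and meant for illustration only."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Usage: the LR+ of 6.18 is taken from the Results above; the 1% pretest
# probability is a hypothetical value chosen only for illustration.
hypothetical_pretest = 0.01
updated = post_test_probability(hypothetical_pretest, 6.18)
print(f"post-test probability: {updated:.3f}")
```

The odds conversion shows why a positive likelihood ratio in this range raises, but does not settle, the probability of sepsis when the outcome is rare, which is consistent with the abstract's suggestion that such models be combined with clinical judgment.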
Conclusions and relevance: Using a large multicenter population, models were developed and validated with high AUROC to predict the future development of sepsis based on EHR data collected in the ED. The gradient tree boosting models achieved positive likelihood ratios of approximately 4 to 6 for predicting sepsis and septic shock. The results highlight the opportunity for future studies that combine EHR-based models with clinical judgment to improve prediction.
