Derivation and validation of a clinical predictive model for longer duration diarrhea among pediatric patients in Kenya using machine learning algorithms
- PMID: 39815316
- PMCID: PMC11737202
- DOI: 10.1186/s12911-025-02855-6
Abstract
Background: Despite the adverse health outcomes associated with longer duration diarrhea (LDD), there are currently no clinical decision tools for timely identification and better management of children with increased risk. This study utilizes machine learning (ML) to derive and validate a predictive model for LDD among children presenting with diarrhea to health facilities.
Methods: LDD was defined as a diarrhea episode lasting ≥ 7 days. We used seven ML algorithms to build prognostic models for predicting LDD among children < 5 years, using de-identified data from the Vaccine Impact on Diarrhea in Africa (VIDA) study (N = 1,482) for model development and data from the Enterics for Global Health (EFGH) Shigella study (N = 682) for temporal validation of the champion model. Features included demographic, medical history and clinical examination data collected at enrolment in both studies. We conducted split-sampling and employed K-fold cross-validation with an over-sampling technique during model development. Critical predictors of LDD and their impact on prediction were obtained using an explainable, model-agnostic approach. The champion model was determined based on the area under the curve (AUC) metric. Model calibration was assessed using the Brier score and Spiegelhalter's z-test with its accompanying p-value.
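The two calibration measures named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy labels and probabilities are assumptions made for illustration, and the formulas are the standard definitions of the Brier score and Spiegelhalter's z-statistic.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter's z-statistic and its two-sided p-value.

    Under the null hypothesis of perfect calibration, z is approximately
    standard normal, so a large p-value means no evidence of miscalibration.
    """
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    if variance == 0:
        raise ValueError("variance is zero; z is undefined for these probabilities")
    z = numerator / math.sqrt(variance)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Toy data (hypothetical, not study data): 1 = LDD, 0 = no LDD.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.9]

brier = brier_score(y_true, y_prob)
z, p_value = spiegelhalter_z(y_true, y_prob)
print(f"Brier = {brier:.3f}, z = {z:.3f}, p = {p_value:.3f}")
```

A lower Brier score indicates better overall accuracy of the predicted probabilities, while a non-significant Spiegelhalter p-value is read as no detectable departure from perfect calibration.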
Results: There was a significant difference in prevalence of LDD between the development and temporal validation cohorts (478 [32.3%] vs 69 [10.1%]; p < 0.001). The following variables were associated with LDD in decreasing order: pre-enrolment diarrhea days (55.1%), modified Vesikari score (18.2%), age group (10.7%), vomit days (8.8%), respiratory rate (6.5%), vomiting (6.4%), vomit frequency (6.2%), rotavirus vaccination (6.1%), skin pinch (2.4%) and stool frequency (2.4%). While all models showed good prediction capability, the random forest model achieved the best performance (AUC [95% Confidence Interval]: 83.0 [78.6-87.5] and 71.0 [62.5-79.4]) on the development and temporal validation datasets, respectively. The random forest model showed slight deviations from perfect calibration, but these deviations were not statistically significant (Brier score = 0.17, Spiegelhalter p-value = 0.219).
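The reported difference in LDD prevalence between cohorts can be checked from the counts given in the Results. The abstract does not state which test produced the p-value, so the pooled two-proportion z-test below is an assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the Results: 478 of 1,482 (development) vs 69 of 682 (validation).
prev_dev = 478 / 1482
prev_val = 69 / 682
z_stat, p_val = two_proportion_z(478, 1482, 69, 682)
print(f"{prev_dev:.1%} vs {prev_val:.1%}, z = {z_stat:.2f}, p < 0.001: {p_val < 0.001}")
```

Under this assumed test, the counts reproduce the stated prevalences of 32.3% and 10.1% and a p-value well below 0.001.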
Conclusions: Our study suggests that ML-derived algorithms could be used to rapidly identify children at increased risk of LDD. Integrating ML-derived models into clinical decision-making may allow clinicians to target these children with closer observation and enhanced management.
Keywords: Longer duration diarrhea; Machine Learning; Pediatric; Prediction.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: VIDA protocol was approved by the Institutional Review Board of the University of Maryland School of Medicine, Baltimore, MD, USA (UMB Protocol #: HM-HP-00062472) and the Kenya Medical Research Institute (KEMRI) Scientific and Ethical Review Unit (SERU) (SERU#2996). The EFGH protocol was approved by the KEMRI SERU (SERU#4362). Written informed consent was sought from caregivers in both studies before initiation of study procedures. Additionally, ethical approval for undertaking the current study was sought from the health research ethics committee of the University of South Africa, College of Agricultural Sciences (2023/CAES_HREC/2192). Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.