BMC Infect Dis. 2024 Mar 21;24(1):338.
doi: 10.1186/s12879-024-09195-2.

Empowering child health: Harnessing machine learning to predict acute respiratory infections in Ethiopian under-fives using demographic and health survey insights

Mulugeta Hayelom Kalayou et al. BMC Infect Dis.

Abstract

Background: Studies have shown that infectious diseases cause the majority of deaths among under-five children. Worldwide, acute respiratory infection (ARI) remains the second most frequent cause of illness and mortality among children under the age of five, and it is still the leading disease burden in developing nations, including Ethiopia.

Objective: This study aims to determine the magnitude and predictors of ARI among under-five children in Ethiopia using state-of-the-art machine learning algorithms.

Methods: Data for this study were derived from the 2016 Ethiopian Demographic and Health Survey. To identify the determinants of acute respiratory infections, we performed several experiments on ten machine learning algorithms (random forests, decision trees, support vector machines, Naïve Bayes, K-nearest neighbors, Lasso regression, GBoost, and XGBoost), including one classic logistic regression model and an ensemble of the best-performing models. The predictive ability of each model was assessed using receiver operating characteristic curves, precision-recall curves, and classification metrics.
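The abstract does not state how the ensemble combines its member models. One common choice is soft voting, where the predicted probabilities of the members are averaged and the mean is thresholded. The sketch below illustrates that idea only; the `soft_vote` helper, the `ProbModel` type, and the stand-in models with fixed outputs are all hypothetical and are not taken from the study.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: a fitted model reduced to "feature vector -> P(ARI)".
ProbModel = Callable[[Sequence[float]], float]


def soft_vote(models: Sequence[ProbModel], features: Sequence[float],
              threshold: float = 0.5) -> Tuple[float, int]:
    """Average the members' predicted probabilities and threshold the mean."""
    if not models:
        raise ValueError("need at least one model")
    mean_prob = sum(m(features) for m in models) / len(models)
    return mean_prob, int(mean_prob >= threshold)


# Stand-ins for fitted SVM, GBoost and XGBoost members (made-up outputs).
svm_like = lambda x: 0.70
gboost_like = lambda x: 0.40
xgboost_like = lambda x: 0.60

prob, label = soft_vote([svm_like, gboost_like, xgboost_like], features=[0.0])
print(round(prob, 3), label)  # prints "0.567 1"
```

Averaging probabilities lets a confident member outweigh a hesitant one, which a simple majority vote over hard labels would not do.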

Results: The prevalence of ARI among 9501 under-five children in Ethiopia was 7.2%. The ensemble model of SVM, GBoost, and XGBoost showed improved performance in classifying ARI cases, with an accuracy of 86%, a sensitivity of 84.6%, and an AUC-ROC of 0.87. The highest-performing predictive model (the ensemble model) showed that the child's age, history of diarrhea, wealth index, type of toilet, mother's educational level, number of living children, mother's occupation, and type of fuel used were important predictors of acute respiratory infection among under-five children.
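With a prevalence of 7.2%, a classifier that never predicts ARI would already reach 92.8% accuracy on data with that class balance, which is why sensitivity and AUC-ROC are reported alongside accuracy. The following minimal sketch shows how the three metrics are defined, using plain Python and toy labels and scores that are invented for illustration and are not the study's data.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 marks an ARI case."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true cases that the model flags (recall of the positive class)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented): 2 cases among 10 children.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.2, 0.3]
y_pred = [int(s >= 0.5) for s in scores]

print(accuracy(y_true, y_pred))     # 0.8
print(sensitivity(y_true, y_pred))  # 0.5
print(auc_roc(y_true, scores))      # 0.9375

# A model that never predicts ARI matches the accuracy but finds no cases.
never = [0] * len(y_true)
print(accuracy(y_true, never), sensitivity(y_true, never))  # 0.8 0.0
```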

Conclusion: The intricate web of factors contributing to ARI among under-five children was identified using an advanced machine learning algorithm. The child's age, history of diarrhea, wealth index, and type of toilet were among the top factors identified by the ensemble model, which achieved 86% accuracy. This study stands as a testament to the potential of advanced data-driven methodologies in unraveling the complexities of ARI in low-income settings.

Keywords: Acute respiratory infection; Artificial intelligence; Ethiopia; FAIR; Machine learning.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Cramer’s V correlation heatmap
Fig. 2. AUC-ROC values of the trained models (the comparison of training and test set AUC-ROC values is presented as a supplementary file (Appendix))
Fig. 3. The precision-recall curve of the trained model (the comparison of AUC-PRC values for the training and test sets is presented in the supplementary file (Appendix))
Fig. 4. Feature importance identified by the ensemble model
Fig. 5. SHAP feature impact on model prediction
Fig. 6. FAIR indicator assessment of data and metadata of source code. RDA = Research Data Alliance, F1-F4 = Findability indicators, A1-A2 = Accessibility indicators, D = Data, M = Metadata
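
The heatmap in Fig. 1 is built from Cramér's V, a measure of association between two categorical variables defined as V = sqrt(chi2 / (n * min(r - 1, c - 1))), where chi2 is the Pearson chi-squared statistic of the r-by-c contingency table and n is the sample size. The sketch below computes the uncorrected form in plain Python. Whether the authors applied a bias correction is not stated here, and the variable names and toy values are invented for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cramers_v(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Uncorrected Cramér's V between two equally long categorical columns."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))  # missing cells count as 0
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy columns (invented values, hypothetical variable names).
fuel_type = ["wood", "wood", "electric", "electric"]
toilet_type = ["pit", "pit", "flush", "flush"]
wealth = ["poor", "rich", "poor", "rich"]

print(cramers_v(fuel_type, toilet_type))  # 1.0 (perfect association)
print(cramers_v(fuel_type, wealth))       # 0.0 (no association)
```

Values range from 0 (no association) to 1 (one variable fully determines the other), so a heatmap of pairwise values can flag redundant categorical predictors before modeling.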
