PeerJ Comput Sci. 2025 May 28:11:e2916.
doi: 10.7717/peerj-cs.2916. eCollection 2025.

Comparative performance of twelve machine learning models in predicting COVID-19 mortality risk in children: a population-based retrospective cohort study in Brazil

Adriano Lages Dos Santos et al. PeerJ Comput Sci. 2025.

Abstract

The COVID-19 pandemic has catalyzed the application of advanced digital technologies such as artificial intelligence (AI) to predict mortality in adult patients. However, the development of machine learning (ML) models for predicting outcomes in children and adolescents with COVID-19 remains limited. This study aimed to evaluate the performance of multiple ML models in forecasting mortality among hospitalized pediatric COVID-19 patients. In this cohort study, we used the SIVEP-Gripe dataset, a public resource maintained by the Ministry of Health to track severe acute respiratory syndrome (SARS) in Brazil. To create subsets for training and testing the ML models, we divided the primary dataset into three parts. Using these subsets, we developed and trained 12 ML algorithms to predict the outcomes. We assessed the performance of these models using accuracy, precision, sensitivity, recall, and area under the receiver operating characteristic curve (AUC). Among the 37 variables examined, 24 were found to be potential indicators of mortality, as determined by the chi-square test of independence. The logistic regression (LR) algorithm achieved the highest performance on the optimized dataset, with an accuracy of 92.5% and an AUC of 80.1%. The gradient boosting classifier (GBC) and AdaBoost (ADA) closely followed the LR algorithm, producing similar results. Our study also revealed that reduced oxygen saturation at baseline, presence of comorbidities, and older age were the most relevant factors in predicting mortality in children and adolescents hospitalized with SARS-CoV-2 infection. ML models can support clinical decisions and evidence-based patient management strategies, which can enhance patient outcomes and the overall quality of medical care. The LR, GBC, and ADA models predicted mortality in pediatric COVID-19 patients accurately and efficiently.
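The abstract reports that candidate variables were screened with the chi-square test of independence. The following is a minimal sketch of that test for a single binary variable against a binary outcome, using only the Python standard library. The function name, the 0.05 threshold, and the counts are illustrative assumptions; they are not taken from the study or from SIVEP-Gripe, and the authors' actual implementation is not described on this page.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    table[i][j] is the number of patients with feature level i and outcome j.
    Returns (statistic, p_value) for 1 degree of freedom, without continuity
    correction. Assumes every row and column total is non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row_total in enumerate(row_totals):
        for j, col_total in enumerate(col_totals):
            # Expected count if feature and outcome were independent.
            expected = row_total * col_total / n
            statistic += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts for illustration only (NOT study data).
# Rows: feature present / absent. Columns: died / discharged.
toy_table = [[30, 70], [10, 190]]
statistic, p_value = chi_square_2x2(toy_table)

# Hypothetical screening rule: keep the variable if p < 0.05.
keep_feature = p_value < 0.05
print(f"chi2 = {statistic:.3f}, p = {p_value:.2e}, keep = {keep_feature}")
```

Variables with more than two levels need the general r x c form of the test, where the degrees of freedom are (r - 1)(c - 1) and the p-value no longer reduces to a single erfc call.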

Keywords: Artificial intelligence; COVID-19; Children; Death prediction; Healthcare; Machine learning; Mortality; Risk.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Performance of the developed models for the metrics chosen.
Model performance for each type of metric. (A) Model performance with the AUC metric. (B) Accuracy of the developed models. (C) Precision metric for the developed models. (D) Recall metric. (E) Sensitivity metric. (F) F1-score, the harmonic mean of recall and precision.
Figure 2. ROC curves of the three best-performing ML models on Dataset 3.
Figure 3. A summary plot of SHAP values for mortality prediction on Dataset 3 (features selected by chi-squared test).
Blue dots indicate that low values of a feature contribute to the model classifying the patient as discharged, and red dots indicate that high values of a feature contribute to the model classifying the patient as dead. Features are ordered from most to least important, top to bottom.
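The threshold-based metrics shown in Figure 1 (accuracy, precision, recall, and F1-score) can all be derived from the four cells of a confusion matrix. The sketch below uses the standard definitions, in which recall and sensitivity are the same quantity, TP / (TP + FN). The function name and the counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts.

    tp: deaths predicted as deaths        fp: discharges predicted as deaths
    fn: deaths predicted as discharges    tn: discharges predicted as discharges
    Assumes at least one predicted positive and one actual positive.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to sensitivity
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts for illustration only (NOT study data).
metrics = classification_metrics(tp=40, fp=10, fn=60, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts accuracy is 0.930 while recall is only 0.400, which illustrates why accuracy is usually read alongside the other metrics when one outcome is rare.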
