Multicenter Study
Sci Rep. 2023 Mar 1;13(1):3463.
doi: 10.1038/s41598-023-28579-z.

Potential and limitations of machine meta-learning (ensemble) methods for predicting COVID-19 mortality in a large inhospital Brazilian dataset

Bruno Barbosa Miranda de Paiva  1 Polianna Delfino Pereira  2   3 Claudio Moisés Valiense de Andrade  1 Virginia Mara Reis Gomes  4 Maira Viana Rego Souza-Silva  4 Karina Paula Medeiros Prado Martins  4 Thaís Lorenna Souza Sales  5 Rafael Lima Rodrigues de Carvalho  3 Magda Carvalho Pires  6 Lucas Emanuel Ferreira Ramos  6 Rafael Tavares Silva  6 Alessandra de Freitas Martins Vieira  7 Aline Gabrielle Sousa Nunes  8 Alzira de Oliveira Jorge  9 Amanda de Oliveira Maurílio  10 Ana Luiza Bahia Alves Scotton  11 Carla Thais Candida Alves da Silva  12 Christiane Corrêa Rodrigues Cimini  13 Daniela Ponce  14 Elayne Crestani Pereira  15 Euler Roberto Fernandes Manenti  16 Fernanda d'Athayde Rodrigues  17 Fernando Anschau  18 Fernando Antônio Botoni  19 Frederico Bartolazzi  12 Genna Maira Santos Grizende  20 Helena Carolina Noal  21 Helena Duani  4 Isabela Moraes Gomes  4 Jamille Hemétrio Salles Martins Costa  22 Júlia di Sabatino Santos Guimarães  19 Julia Teixeira Tupinambás  23 Juliana Machado Rugolo  14 Joanna d'Arc Lyra Batista  24 Joice Coutinho de Alvarenga  25 José Miguel Chatkin  26 Karen Brasil Ruschel  16 Liege Barella Zandoná  27 Lílian Santos Pinheiro  13 Luanna Silva Monteiro Menezes  23   28 Lucas Moyses Carvalho de Oliveira  29 Luciane Kopittke  18 Luisa Argolo Assis  30 Luiza Margoto Marques  7 Magda Cesar Raposo  5 Maiara Anschau Floriani  31   32 Maria Aparecida Camargos Bicalho  33 Matheus Carvalho Alves Nogueira  34 Neimy Ramos de Oliveira  35 Patricia Klarmann Ziegelmann  36 Pedro Gibson Paraiso  37 Petrônio José de Lima Martelli  38 Roberta Senger  21 Rochele Mosmann Menezes  39 Saionara Cristina Francisco  40 Silvia Ferreira Araújo  41 Tatiana Kurtz  39 Tatiani Oliveira Fereguetti  35 Thainara Conceição de Oliveira  42 Yara Cristina Neves Marques Barbosa Ribeiro  40 Yuri Carlotto Ramires  27 Maria Clara Pontello Barbosa Lima  43 Marcelo Carneiro  39 Adriana Falangola Benjamin Bezerra  38 Alexandre Vargas Schwarzbold  21 André Soares de Moura Costa  34 Barbara Lopes Farace  9 Daniel Vitorio Silveira  8 Evelin Paola de Almeida Cenci  42 Fernanda Barbosa Lucas  12 Fernando Graça Aranha  15 Gisele Alsina Nader Bastos  31 Giovanna Grunewald Vietta  15 Guilherme Fagundes Nascimento  8 Heloisa Reniers Vianna  29 Henrique Cerqueira Guimarães  9 Julia Drumond Parreiras de Morais  29 Leila Beltrami Moreira  17 Leonardo Seixas de Oliveira  13 Lucas de Deus Sousa  11 Luciano de Souza Viana  22 Máderson Alvares de Souza Cabral  4 Maria Angélica Pires Ferreira  17 Mariana Frizzo de Godoy  26 Meire Pereira de Figueiredo  12 Milton Henriques Guimarães-Junior  22 Mônica Aparecida de Paula de Sordi  14 Natália da Cunha Severino Sampaio  35 Pedro Ledic Assaf  40 Raquel Lutkmeier  18 Reginaldo Aparecido Valacio  23 Renan Goulart Finger  44 Rufino de Freitas  10 Silvana Mangeon Meirelles Guimarães  41 Talita Fischer Oliveira  23 Thulio Henrique Oliveira Diniz  10 Marcos André Gonçalves  1 Milena Soriano Marcolino  45   46   47

Bruno Barbosa Miranda de Paiva et al. Sci Rep.

Abstract

Most early prediction scores and methods for predicting COVID-19 mortality are limited by methodological flaws and technological constraints (e.g., reliance on a single prediction model). Our aim is to provide a thorough comparative study that addresses those methodological issues by considering multiple techniques for building mortality prediction models, including modern machine learning (neural) algorithms and traditional statistical techniques, as well as meta-learning (ensemble) approaches. This study used a dataset from a multicenter cohort of 10,897 adult Brazilian COVID-19 patients admitted from March 2020 to November 2021 [median age 60 years (interquartile range 48-71), 46% women]. We also proposed new population-based meta-features that had not previously been described in the literature. Stacking achieved the best results reported in the literature for the death prediction task, improving over the previous state of the art by more than 46% in recall for predicting death, with an AUROC of 0.826 and a macro-F1 of 65.4%. The newly proposed meta-features were highly discriminative of death but did not produce large improvements in final prediction performance, suggesting that we may be at the limit of the prediction capability achievable with the current set of ML techniques and (meta-)features. Finally, we investigated how the trained models perform across hospitals and found large differences in classifier performance between them, which further supports the case that errors are produced by factors that cannot be modeled with the current predictors.
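
The three metrics quoted in the abstract can be made concrete with a small sketch. The Python below is illustrative only: the labels and scores are made-up toy values, not study data, the 0.5 decision threshold is an assumption, and the function names are hypothetical. It computes recall for the death class, macro-F1 as the unweighted mean of the per-class F1 scores, and AUROC as the probability that a randomly chosen death case receives a higher score than a randomly chosen survivor (ties count one half).

```python
from itertools import product


def recall_and_f1(y_true, y_pred, positive):
    """Return (recall, F1) treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so both classes count equally."""
    classes = sorted(set(y_true))
    return sum(recall_and_f1(y_true, y_pred, c)[1] for c in classes) / len(classes)


def auroc(y_true, scores):
    """Fraction of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = death, 0 = survival.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

death_recall, death_f1 = recall_and_f1(y_true, y_pred, positive=1)
print(f"recall (death): {death_recall:.3f}")
print(f"macro-F1:       {macro_f1(y_true, y_pred):.3f}")
print(f"AUROC:          {auroc(y_true, scores):.3f}")
```

Because macro-F1 averages the two classes without weighting by their size, a model cannot score well simply by predicting the majority outcome, which is one reason it is commonly reported alongside AUROC for imbalanced outcomes such as in-hospital death.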

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Info-gain on the best model features, including populational and classifier output based meta-features. FiO2: fraction of inspired oxygen; GAM: generalized additive models; KNN: K-nearest neighbors; LightGBM: light gradient boosting machines; pO2: partial pressure of oxygen; RF: random forest; SVM: support vector machines.
Figure 2
Receiver operating characteristic (ROC) curve comparing multiple models trained to predict the death outcome.
Figure 3
A sample decision tree with depth 2, trained on our dataset. At each level but the last, the first line of text in each box shows the variable and its cut before the split.
Figure 4
Mutual information scores on the top-20 base patient features (A) and Pearson correlation scores between the top-20 most predictive features and lethality (B). Adm: admission; ALT: alanine aminotransferase; AST: aspartate transaminase; FiO2: fraction of inspired oxygen; TTPA: partial activated thromboplastin time.
Figure 5
Pearson correlation scores between model variables and false positive/negative errors. Adm: admission; FiO2: fraction of inspired oxygen; INR: international normalized ratio; pCO2: partial pressure of carbon dioxide.
Figure 6
Comparison of mean normalized values between errors and non-errors. Adm: admission; FiO2: fraction of inspired oxygen; INR: international normalized ratio; pCO2: partial pressure of carbon dioxide.
Figure 7
Comparison of ROC curves and AUROC results between different hospitals.
Figure 8
Comparison of ROC curves and AUROC results between different hospitals.
Figure 9
Error rates for each confidence threshold in the Stacking model without populational meta-features (which had the best macro-F1 result). The X-axis shows prediction ranges for the model's confidence score, while the y-axis shows the percentage of hits or misses for the model.
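
The breakdown described in the Figure 9 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the number of score ranges, the 0.5 decision threshold, the function name, and the toy labels and scores are all assumptions. Each prediction is assigned to an equal-width range of the model's confidence score, and the percentage of hits and misses is reported per range.

```python
def error_rates_by_confidence(y_true, scores, n_bins=5, threshold=0.5):
    """Return (low, high, pct_hits, pct_misses) for each non-empty score range."""
    counts = [[0, 0] for _ in range(n_bins)]  # [hits, misses] per range
    for label, score in zip(y_true, scores):
        # Clamp so that a score of exactly 1.0 falls in the last range.
        idx = min(int(score * n_bins), n_bins - 1)
        predicted = int(score >= threshold)
        counts[idx][0 if predicted == label else 1] += 1

    rows = []
    for idx, (hits, misses) in enumerate(counts):
        total = hits + misses
        if total == 0:
            continue  # no predictions fell in this range
        rows.append((idx / n_bins, (idx + 1) / n_bins,
                     100 * hits / total, 100 * misses / total))
    return rows


# Toy example (hypothetical values): 1 = death, 0 = survival.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.05, 0.15, 0.45, 0.42, 0.55, 0.58, 0.85, 0.95]

rows = error_rates_by_confidence(y_true, scores)
for low, high, pct_hits, pct_misses in rows:
    print(f"[{low:.1f}, {high:.1f}): hits {pct_hits:.0f}%  misses {pct_misses:.0f}%")
```

In this toy data the misses concentrate in the middle range, where the score is close to the decision threshold, which is the kind of pattern such a plot is meant to expose.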

Publication types