Sci Total Environ. 2021 Apr 15;765:142723.
doi: 10.1016/j.scitotenv.2020.142723. Epub 2020 Oct 6.

Evaluating the plausible application of advanced machine learnings in exploring determinant factors of present pandemic: A case for continent specific COVID-19 analysis

Suman Chakraborti et al. Sci Total Environ.

Abstract

Coronavirus disease 2019 (COVID-19), caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global health concern due to its unpredictable nature and the lack of adequate medicines. Machine learning (ML) models could be effective in identifying the most critical factors responsible for the overall fatalities caused by COVID-19. The functional capabilities of ML models in epidemiological research, especially for COVID-19, have not been substantially explored. To bridge this gap, this study adopted two advanced ML models, viz. Random Forest (RF) and Gradient Boosted Machine (GBM), to perform regression modelling and provide subsequent interpretation. The analysis followed five successive steps: (1) identification of relevant key explanatory variables; (2) application of data dimensionality reduction to eliminate redundant information; (3) use of ML models to measure the relative influence (RI) of the explanatory variables; (4) evaluation of the interconnections among the key explanatory variables and COVID-19 case and death counts; (5) time series analysis to examine the incidence rates of COVID-19 cases and deaths. Among the explanatory variables considered in this study, air pollution, migration, economy, and demographic factors were found to be the most significant controlling factors. Since very limited research is available on the suitability of ML models for identifying the key determinants of COVID-19, this study could serve as a reference for future public health research. Additionally, all the models and data used in this study are open source and freely available, so the analysis should be straightforward to reproduce and replicate.
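Steps (2) and (3) above can be illustrated with a minimal Python sketch. The abstract does not say which dimensionality reduction method was used or how RI was scaled, so everything here is an assumption: a simple pairwise-correlation filter stands in for step (2), and importance scores are rescaled to sum to 100%, a common convention for relative influence. The variable names (`pm25`, `pm10`, `gdp`), the 0.9 threshold, and all values are hypothetical toy data, not figures from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_redundant(columns, threshold=0.9):
    """Keep a variable only if |r| with every already-kept variable is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def relative_influence(raw_importance):
    """Rescale non-negative importance scores so that they sum to 100 percent."""
    total = sum(raw_importance.values())
    return {name: 100.0 * score / total for name, score in raw_importance.items()}


# Hypothetical explanatory variables (one value per country); not data from the study.
columns = {
    "pm25": [10.0, 20.0, 30.0, 40.0, 50.0],
    "pm10": [21.0, 39.0, 62.0, 80.0, 99.0],  # nearly collinear with pm25
    "gdp": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_redundant(columns)

# Hypothetical raw importance scores, e.g. as a fitted RF or GBM might report them.
ri = relative_influence({"pm25": 3.0, "gdp": 1.0})
```

In this toy run, `pm10` is dropped because it is almost perfectly correlated with `pm25`, and the remaining scores are expressed as percentage shares. In practice the raw scores would come from a fitted RF or GBM model rather than being typed in by hand.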

Keywords: Air pollution; COVID-19; Machine learning; Pandemic; Relative importance; Socioeconomic.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1. Spatial distribution of global COVID-19 cases and deaths (per 100,000 persons). The continent-specific daily progression of COVID-19 cases and deaths is shown in the bottom left corner.
Fig. 2. Alluvial plot showing the strength of interconnection between the explanatory variables and predicted COVID-19 cases, derived from the random forest algorithm. Relative influence (RI) values of each variable are shown on the right side of each plot.
Fig. 3. Alluvial plot showing the strength of interconnection between the explanatory variables and predicted COVID-19 deaths, derived from the random forest algorithm. Relative influence (RI) values of each variable are shown on the right side of each plot.
Fig. 4. Predictive power of the explanatory variables for COVID-19 cases, derived from the random forest algorithm.
Fig. 5. Predictive power of the explanatory variables for COVID-19 deaths, derived from the random forest algorithm.
Fig. 6. A comprehensive global pandemic preparedness path highlighting the strategies that need to be given importance.
