Expert Syst Appl. 2022 Dec 15:209:118377.
doi: 10.1016/j.eswa.2022.118377. Epub 2022 Aug 5.

Dietary, comorbidity, and geo-economic data fusion for explainable COVID-19 mortality prediction

Milena Trajanoska et al. Expert Syst Appl.

Abstract

Many factors significantly influence the outcomes of infectious diseases such as COVID-19. Dietary habits deserve particular attention as environmental factors, since imbalanced diets have been deemed to contribute to chronic diseases. However, not enough effort has been made to assess these relations. So far, studies in the field have shown that comorbid conditions influence the severity of COVID-19 symptoms in infected patients. Furthermore, COVID-19 has exhibited seasonal patterns in its spread, so including weather-related factors in the analysis of mortality rates might better explain the disease's progression. In this work, we provide an explainable analysis of the global risk factors for COVID-19 mortality on a national scale, considering dietary habits fused with data on past comorbidity prevalence and environmental factors such as seasonally averaged temperature, geolocation, economic and development indices, and undernourishment and obesity rates. The innovation of this paper lies equally in the explainability of the obtained results, the data fusion methods, and the broad context considered in the analysis. Apart from a country's age and gender distribution, which have already been shown to influence COVID-19 mortality rates, our empirical analysis shows that countries with imbalanced dietary habits generally tend to have higher predicted COVID-19 mortality. Ultimately, we show that fusing the dietary data set with the geo-economic variables models country-wise COVID-19 mortality rates more accurately than considering dietary habits alone, which supports the hypothesis that fusing factors from different contexts contributes to a better descriptive analysis of COVID-19 mortality rates.

Keywords: COVID-19 mortality prediction; Comorbidity; Data fusion; Dietary habits; Geo-economic factors.
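
The data fusion described in the abstract can be pictured as joining country-level tables on a shared country key. The following is a minimal sketch under that assumption, not the authors' pipeline: the function name `fuse_by_country`, the feature names (`fat_share`, `gdp_per_capita`, `obesity_rate`), and all values are hypothetical placeholders, and the inner join (keeping only countries present in every table) is an illustrative choice.

```python
def fuse_by_country(*tables):
    """Inner-join country-level feature tables on the country name.

    Each table maps a country name to a dict of feature values.
    Only countries present in every table are kept (an assumption
    made for this sketch; the source does not state the join rule).
    """
    if not tables:
        return {}
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    fused = {}
    for country in sorted(shared):
        row = {}
        for table in tables:
            row.update(table[country])  # later tables add their own columns
        fused[country] = row
    return fused


# Hypothetical placeholder data, not values from the paper.
dietary = {
    "Country A": {"fat_share": 0.30},
    "Country B": {"fat_share": 0.22},
    "Country C": {"fat_share": 0.41},
}
geo_economic = {
    "Country A": {"gdp_per_capita": 12.0, "obesity_rate": 0.20},
    "Country C": {"gdp_per_capita": 45.0, "obesity_rate": 0.31},
}

fused = fuse_by_country(dietary, geo_economic)
for country, features in fused.items():
    print(country, features)
```

"Country B" is dropped here because it has no geo-economic row; a real pipeline might instead impute the missing values.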

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
The Machine Learning Pipeline. The figure represents the implemented Machine Learning pipeline. Datasets are described with blue rectangles. Methods and algorithms are represented with rounded orange rectangles. Arrows beginning from a dataset to a method/algorithm indicate that the data set is passed as input to that method/algorithm. Arrows starting from a method/algorithm show that the output is produced and can be fed to another step.
Fig. 2
Feature importance plot for the dietary data set. Red dots represent high values of the current feature in the used data set. Blue dots represent low values of the current feature in the used data set. Values on the x-axis represent the magnitude and sign of the impact that each value of the feature has on predicting the target variable COVID-19 mortality.
Fig. 3
Feature importance plot for fusing the dietary and the geo-economic data set. Red dots represent high values of the current feature in the used data set. Blue dots represent low values of the current feature in the used data set. Values on the x-axis represent the magnitude and sign of the impact that each value of the feature has on predicting the target variable COVID-19 mortality.
Fig. 4
Feature importance plot for the fusion of the full data set. Red dots represent high values of the current feature in the used data set. Blue dots represent low values of the current feature in the used data set. Values on the x-axis represent the magnitude and sign of the impact that each value of the feature has on the prediction of the target variable COVID-19 mortality.
Fig. 5
Feature impact analysis for North Macedonia. Values on the x-axis represent the magnitude and sign of the impact that each value of the feature has on the prediction of the target variable COVID-19 mortality for the specific country. Red arrows pointing right represent high values of the current feature for the country. Blue arrows pointing left represent low values of the current feature for the country.
Fig. 6
Feature impact analysis for South Africa. Values on the x-axis represent the magnitude and sign of the impact that each value of the feature has on the prediction of the target variable COVID-19 mortality for the specific country. Red arrows pointing right represent high values of the current feature for the country. Blue arrows pointing left represent low values of the current feature for the country.
Fig. 7
Self-organizing map clusters for the dietary data set. Countries colored in green have a low COVID-19 mortality rate. Countries colored in yellow have an average COVID-19 mortality rate. Countries colored in red have a high COVID-19 mortality rate. Countries belonging to the same square are treated as being the same. Squares that are close to each other are evaluated to have similar values of the recorded features.
Fig. 8
Decision map for the SOM clusters of the dietary data set. The squares in the figure correspond to those in Fig. 7. Each square shows the most dominant feature in the decision for clustering the countries.
Fig. 9
Regression analysis prediction distribution for the dietary data set. The figure is divided into two subplots for better visual representation. The predictions are ordered by increasing error. The first subplot contains the more accurately predicted COVID-19 mortality rates, and the second subplot contains the less accurately predicted COVID-19 mortality rates. Each point on the x-axis corresponds to exactly one country. The orange bars represent the actual COVID-19 mortality rate, as a percentage of the total population, for the current country on the x-axis. The blue bars represent the predicted COVID-19 mortality rate, as a percentage of the total population, for the current country on the x-axis.
Fig. 10
Regression analysis prediction distribution for the fused data set. The figure is divided into two subplots for better visual representation. The predictions are ordered by increasing error. The first subplot contains the more accurately predicted COVID-19 mortality rates, and the second subplot contains the less accurately predicted COVID-19 mortality rates. Each point on the x-axis corresponds to exactly one country. The orange bars represent the actual COVID-19 mortality rate, as a percentage of the total population, for the current country on the x-axis. The blue bars represent the predicted COVID-19 mortality rate, as a percentage of the total population, for the current country on the x-axis.
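
The ordering used in the regression figures (predictions sorted by increasing error, then divided into a more accurate and a less accurate group) can be sketched as follows. This is only an illustration: the captions do not state the error measure or where the split falls, so absolute error and a midpoint split are assumptions, and the function name `split_by_error`, the country labels, and the values are hypothetical.

```python
def split_by_error(actual, predicted):
    """Order countries by increasing absolute prediction error and
    split them into a more accurate and a less accurate half.

    Absolute error and the midpoint split are assumptions of this
    sketch, not details taken from the source.
    """
    errors = [(abs(predicted[c] - actual[c]), c) for c in actual]
    ordered = [country for _, country in sorted(errors)]
    middle = (len(ordered) + 1) // 2  # first half gets the extra item
    return ordered[:middle], ordered[middle:]


# Hypothetical mortality rates (percent of total population).
actual = {"A": 0.10, "B": 0.05, "C": 0.20, "D": 0.02}
predicted = {"A": 0.11, "B": 0.09, "C": 0.12, "D": 0.02}

more_accurate, less_accurate = split_by_error(actual, predicted)
print("first subplot:", more_accurate)
print("second subplot:", less_accurate)
```

Each group would then be drawn as paired bars (actual and predicted) per country, one subplot per group.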
