Sci Rep. 2023 Apr 14;13(1):6121.
doi: 10.1038/s41598-023-33033-1.

A machine learning approach to predict self-protecting behaviors during the early wave of the COVID-19 pandemic

Alemayehu D Taye et al. Sci Rep.

Abstract

Using a unique harmonized real-time data set from the COME-HERE longitudinal survey that covers five European countries (France, Germany, Italy, Spain, and Sweden) and applying a non-parametric machine learning model, this paper identifies the main individual and macro-level predictors of self-protecting behaviors against the coronavirus disease 2019 (COVID-19) during the first wave of the pandemic. Exploiting the interpretability of a Random Forest algorithm via Shapley values, we find that a higher regional incidence of COVID-19 triggers higher levels of self-protective behavior, as does a stricter government policy response. The level of individual knowledge about the pandemic, confidence in institutions, and population density also rank high among the factors that predict self-protecting behaviors. We also identify a steep socioeconomic gradient, with lower levels of self-protecting behaviors being associated with lower income and poor housing conditions. Among socio-demographic factors, gender, marital status, age, and region of residence are the main determinants of self-protective measures.
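
To make the Shapley-value attribution mentioned in the abstract concrete, the sketch below computes exact Shapley values for a tiny hypothetical model. The `toy_model` function, its three feature names, and the baseline values are illustrative assumptions, not the paper's Random Forest or the COME-HERE data. Features outside a coalition are set to a baseline value, which is one common convention and not necessarily the one used by the authors.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    # Hypothetical stand-in predictor: NOT the paper's Random Forest.
    # x = (regional_incidence, policy_stringency, knowledge)
    incidence, stringency, knowledge = x
    return 2.0 * incidence + 1.0 * stringency + 0.5 * incidence * knowledge


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; others the baseline.
        z = tuple(x[j] if j in subset else baseline[j] for j in range(n))
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


x = (3.0, 2.0, 4.0)         # illustrative respondent (made-up values)
baseline = (1.0, 1.0, 1.0)  # illustrative reference point (made-up values)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the gap between the prediction for `x` and the prediction for `baseline`, which is the efficiency property that makes Shapley values usable as additive explanations. Exact enumeration is exponential in the number of features, so it is only practical for toy cases like this one.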

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Top 30 predictors of self-protecting behaviors. Notes: Panel (a) is the SHAP summary plot for the Random Forests trained on the pooled data set of five European countries to predict self-protecting behaviors responses against COVID-19. The plot displays the top 30 features on prediction (the top on the y-axis is the most important) and the distribution of the impacts of each predictor on the model prediction, which includes a set of distributions where each dot corresponds to an individual. When multiple dots arrive at the same coordinate in the plot, they pile up to show density. The colors correspond to the feature values: red for larger values and blue for smaller ones. A negative SHAP value (extending to the left) shows reduced self-protecting behavior, while a positive (extending to the right) shows an increased self-protecting behavior. Panel (b) displays threefold information: (i) the direction of association captured by the correlation between the feature and SHAP values (red for positive and blue for negative); (ii) strength of the direction of association shown by the darkness of each color gradient; (iii) the magnitude of feature’s marginal impact measured as the average of absolute SHAP values.
Figure 2
Partial effects of stringency policy response and local infection rate. Notes: This figure displays SHAP dependence boxplots of the stringency index. The diamond symbol in the boxes denotes the average of the SHAP value distribution per each value of the stringency index during the first wave of the pandemic. In panel (d), the labels in the x-axis correspond to the number of people in the respondent’s residential area, (1) “isolated dwelling”, (2) “less than 2000”, (3) “between 10,000 and 2000”, (4) “between 50,000 and 10,000”, (5) “between 100,000 and 50,000”, and (6) “more than 100,000”. The values in the x-axis of panels (b) and (c) are the number of deaths and cases summarized in a few bins.
Figure 3
Partial effect of confidence in institutions and level of knowledge about COVID-19. Notes: Each panel (a)–(d) displays SHAP dependence boxplots of features related to trust in institutions and knowledge about COVID-19. The diamond symbol in the boxes denotes the average of SHAP value distribution per each category.
Figure 4
Partial effects of income and housing features. Notes: Each panel (a)–(c) displays SHAP dependence boxplots of income and housing features. The diamond symbol in the boxes denotes the average of SHAP value distribution per each category.
Figure 5
Partial effects of pre-existing health conditions and behavioral risk factors. Notes: Each panel (a)–(f) displays SHAP dependence boxplots of pre-existing health conditions and behavioral risk factors. The diamond symbol in the boxes denotes the average of SHAP value distribution per each category. In panel (d), the labels on the x-axis correspond to alcohol consumption in number of glasses in an average week: (1) “< 5”, (2) “[5, 10)”, (3) “[10, 15)”, (4) “[15, 20)”, (5) “[20, 25)”, (6) “≥ 25”.
Figure 6
Partial effects of socio-demographic factors. Notes: Each panel (a)–(d) displays SHAP dependence boxplots of socio-demographic features. The diamond symbol in the boxes denotes the average of SHAP value distribution per each category.
Figure 7
Feature interaction effects. Notes: Each panel (a)–(c) displays SHAP feature dependence plots of the RF model with the largest interaction effect. Artificial jitter (0.5) was added along the x-axis to better show the overlapping distribution of the points.
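
The summaries described in the figure notes (the average of absolute SHAP values as a measure of impact, the correlation between a feature and its SHAP values as the direction of association, and the per-category average marked by the diamond in each boxplot) can be sketched as follows. The numbers, the "knowledge" feature, and the function names are made up for illustration; they are not results or code from the paper.

```python
from collections import defaultdict
from math import sqrt


def mean_abs(values):
    # Impact: average magnitude of a feature's SHAP values.
    return sum(abs(v) for v in values) / len(values)


def pearson(xs, ys):
    # Direction: correlation between feature values and SHAP values.
    # Assumes neither input is constant (otherwise the denominator is zero).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sqrt(sum((a - mx) ** 2 for a in xs))
    sy = sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


def mean_by_category(categories, values):
    # Per-category average, analogous to the diamond in each boxplot.
    groups = defaultdict(list)
    for c, v in zip(categories, values):
        groups[c].append(v)
    return {c: sum(vs) / len(vs) for c, vs in sorted(groups.items())}


# Made-up example: a 1-3 "knowledge" score and its SHAP values.
knowledge = [1, 1, 2, 2, 3, 3]
shap_vals = [-0.4, -0.2, 0.0, 0.1, 0.3, 0.5]

importance = mean_abs(shap_vals)                   # magnitude of impact
direction = pearson(knowledge, shap_vals)          # sign gives the direction
diamonds = mean_by_category(knowledge, shap_vals)  # one mean per category
print(importance, direction, diamonds)
```

In this toy data the correlation is positive, so higher values of the feature go with higher SHAP values, which in the paper's plots would correspond to the red (positive) direction.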
