BMC Public Health. 2023 Feb 2;23(1):224.
doi: 10.1186/s12889-023-15106-y.

Exploring predictors of welfare dependency 1, 3, and 5 years after mental health-related absence in Danish municipalities between 2010 and 2012 using flexible machine learning modelling

Søren Skotte Bjerregaard. BMC Public Health. 2023.

Abstract

Background: Using XGBoost (XGB), this study demonstrates how flexible machine learning modelling can complement traditional statistical modelling (multinomial logistic regression) as a sensitivity analysis and predictive modelling tool in occupational health research.

Design: The study predicts welfare dependency for a cohort at 1, 3, and 5 years of follow-up using XGB and multinomial logistic regression (MLR). The models' predictive ability is evaluated using tenfold cross-validation (internal validation) and geographical validation (semi-external validation). In addition, we calculate and graphically assess Shapley additive explanation (SHAP) values from the XGB model to examine deviations from linearity assumptions, including interactions. The study population consists of all 20-54-year-olds on long-term sickness absence due to self-reported common mental disorders (CMD) between April 26, 2010, and September 2012 in the 21 (of 98) Danish municipalities that participated in the Danish Return to Work program. The total sample of 19,664 observations is split geospatially into a development set (n = 9,756) and a test set (n = 9,908).
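The two validation schemes named in the design can be sketched in a few lines of plain Python. This is an illustration only, not the study's code: the function names `geographic_split` and `k_fold_indices`, the toy records, and the random assignment of municipalities to the test set are all assumptions, since the abstract does not say how municipalities were allocated.

```python
import random


def geographic_split(records, group_key, test_fraction=0.5, seed=0):
    """Split records so that every group (e.g. municipality) lands wholly
    in either the development set or the test set.

    Assumption: groups are assigned at random; the abstract does not
    describe the actual allocation rule.
    """
    groups = sorted({r[group_key] for r in records})
    random.Random(seed).shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    dev = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return dev, test


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation.

    Each index appears in exactly one validation fold.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid


# Toy data (hypothetical): 210 records spread over 21 municipalities.
records = [{"id": i, "municipality": f"M{i % 21}"} for i in range(210)]
dev, test = geographic_split(records, "municipality")

# Tenfold cross-validation is then run inside the development set only.
folds = list(k_fold_indices(len(dev), k=10))
```

Keeping whole municipalities out of the development set is what makes the test set "semi-external": the model is scored on places it never saw, while the tenfold folds only reshuffle observations from places it did see.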

Results: There were no practical differences between the XGB and MLR models' predictive ability. Industry, job skills, citizenship, unemployment insurance, gender, and period had limited importance in predicting welfare dependency in both models. On the other hand, welfare dependency history and reason for sickness absence were strong predictors. Graphical SHAP analysis of the XGB model did not indicate substantial deviations from the linearity assumptions implied by the multinomial regression model.
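Why SHAP values can reveal departures from linearity may be easier to see on a toy case. The sketch below computes exact Shapley values by averaging marginal contributions over all feature orderings. It is a simplification under stated assumptions: absent features are replaced by a single baseline vector, whereas SHAP for tree models such as XGB uses a dedicated tree algorithm, and the two toy models and their coefficients are invented for illustration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Features not yet "switched on" take their baseline value. The cost is
    factorial in the number of features, so this suits toy examples only.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]           # switch feature j on
            new = predict(current)
            phi[j] += new - prev        # marginal contribution of j
            prev = new
    return [p / len(orders) for p in phi]


def additive(v):
    # Hypothetical purely additive (linear) model.
    return 2.0 * v[0] - 1.0 * v[1]


def interacting(v):
    # Same model plus a hypothetical interaction term.
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[0] * v[1]


baseline = [0.0, 0.0]

# Additive model: the attribution of feature 0 ignores feature 1.
add_a = shapley_values(additive, [1.0, 0.0], baseline)
add_b = shapley_values(additive, [1.0, 3.0], baseline)

# Interacting model: the attribution of feature 0 shifts with feature 1.
int_a = shapley_values(interacting, [1.0, 0.0], baseline)
int_b = shapley_values(interacting, [1.0, 3.0], baseline)
```

In a dependence plot, the additive case would show one value of the attribution per value of the feature, while the interacting case would show vertical spread driven by the other feature. The absence of such spread or curvature is the kind of evidence the graphical analysis looks for.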

Conclusion: Flexible machine learning models like XGB can supplement traditional statistical methods like multinomial logistic regression in occupational health research. They provide a benchmark for predictive performance, a check on whether the traditional model captures the important associations for a given set of predictors, and a way to detect potential violations of linearity.

Trial registration: ISRCTN43004323.

Keywords: Common mental disorders; Machine learning; Return to work; Shapley additive explanation; Welfare dependency; XGBoost.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Participant flow diagram
Fig. 2. Calibration (agreement between observed and predicted probability, smoothed using cubic splines)
Fig. 3. Predictor importance for MLR
Fig. 4. Predictor importance for XGB at 1-year follow-up
Fig. 5. SHAP dependence plots at 1-year follow-up
Fig. 6. SHAP values by outcome and job skill level for self-reported “mental ill-health without further specification”
