Machine learning in causal inference for epidemiology
- PMID: 39535572
- PMCID: PMC11599438
- DOI: 10.1007/s10654-024-01173-x
Abstract
In causal inference, parametric models are usually employed to estimate the effect of interest. However, parametric models rely on the assumption of correct model specification; if this assumption is not met, the effect estimates are biased. Correct model specification is challenging, especially in high-dimensional settings. Incorporating Machine Learning (ML) into causal analyses may reduce the bias arising from model misspecification, since ML methods do not require specifying the functional form of the relationship between variables. However, when ML predictions are plugged directly into a predefined formula for the effect of interest, there is a risk of introducing a "plug-in bias" in the effect measure. To overcome this problem and to achieve useful asymptotic properties, new estimators have been proposed that combine the predictive potential of ML with the ability of traditional statistical methods to make inference about population parameters. For epidemiologists interested in taking advantage of ML for causal inference investigations, we provide an overview of three estimators that represent the current state of the art, namely Targeted Maximum Likelihood Estimation (TMLE), Augmented Inverse Probability Weighting (AIPW) and Double/Debiased Machine Learning (DML).
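As a rough illustration of one of the three estimators, the sketch below computes an AIPW estimate of the average treatment effect from nuisance predictions that are assumed to be already available. It is not the authors' implementation. The function name aipw_ate, the simulated data, and the true effect of 1.0 are all hypothetical; simple stratum means stand in for ML learners, and the sample splitting (cross-fitting) usually paired with ML nuisance models is omitted for brevity. The outcome model is deliberately misspecified (it ignores the confounder) while the propensity score is correct, to show the doubly robust behaviour named in the keywords.

```python
import random
from statistics import mean, stdev


def aipw_ate(y, a, mu1, mu0, ps):
    """AIPW estimate of the average treatment effect with a Wald-type 95% CI.

    y: outcomes; a: binary treatment (0/1);
    mu1, mu0: predicted outcomes under treatment / no treatment;
    ps: predicted probability of treatment (propensity score).
    """
    n = len(y)
    # Per-subject score: plug-in contrast plus an inverse-probability-weighted
    # correction based on the outcome-model residuals.
    phi = [
        m1 - m0 + ai * (yi - m1) / p - (1 - ai) * (yi - m0) / (1 - p)
        for yi, ai, m1, m0, p in zip(y, a, mu1, mu0, ps)
    ]
    est = mean(phi)
    se = stdev(phi) / n ** 0.5  # standard error from the empirical score variance
    return est, (est - 1.96 * se, est + 1.96 * se)


# Hypothetical simulated cohort: binary confounder w, true effect of a on y is 1.0.
rng = random.Random(0)
n = 5000
w = [int(rng.random() < 0.5) for _ in range(n)]
a = [int(rng.random() < (0.8 if wi else 0.2)) for wi in w]
y = [1.0 * ai + 2.0 * wi + rng.gauss(0, 1) for ai, wi in zip(a, w)]

# Propensity score: proportion treated within each confounder stratum (correct model).
p_w = {k: mean(ai for ai, wi in zip(a, w) if wi == k) for k in (0, 1)}
ps = [p_w[wi] for wi in w]

# Outcome model: arm-specific means that ignore w (deliberately misspecified).
m1 = mean(yi for yi, ai in zip(y, a) if ai == 1)
m0 = mean(yi for yi, ai in zip(y, a) if ai == 0)

naive = m1 - m0  # plug-in estimate from the misspecified outcome model
est, ci = aipw_ate(y, a, [m1] * n, [m0] * n, ps)
print(f"plug-in: {naive:.2f}  AIPW: {est:.2f}  95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

In this toy setting the plug-in contrast is confounded, whereas the AIPW correction term recovers an estimate close to the simulated effect because the propensity score is correctly specified. In practice, mu1, mu0 and ps would come from flexible learners fitted on held-out folds.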
Keywords: Causal inference; Doubly-robustness; Machine learning; Targeted learning.
© 2024. The Author(s).
Conflict of interest statement
Declarations. Ethics approval: Not applicable. Competing interests: The authors have no relevant financial or non-financial interests to disclose.
