Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2024 Oct;39(10):1097-1108.
doi: 10.1007/s10654-024-01173-x. Epub 2024 Nov 13.

Machine learning in causal inference for epidemiology

Affiliations

Machine learning in causal inference for epidemiology

Chiara Moccia et al. Eur J Epidemiol. 2024 Oct.

Abstract

In causal inference, parametric models are usually employed to address causal questions estimating the effect of interest. However, parametric models rely on the correct model specification assumption that, if not met, leads to biased effect estimates. Correct model specification is challenging, especially in high-dimensional settings. Incorporating Machine Learning (ML) into causal analyses may reduce the bias arising from model misspecification, since ML methods do not require the specification of a functional form of the relationship between variables. However, when ML predictions are directly plugged in a predefined formula of the effect of interest, there is the risk of introducing a "plug-in bias" in the effect measure. To overcome this problem and to achieve useful asymptotic properties, new estimators that combine the predictive potential of ML and the ability of traditional statistical methods to make inference about population parameters have been proposed. For epidemiologists interested in taking advantage of ML for causal inference investigations, we provide an overview of three estimators that represent the current state-of-art, namely Targeted Maximum Likelihood Estimation (TMLE), Augmented Inverse Probability Weighting (AIPW) and Double/Debiased Machine Learning (DML).

Keywords: Causal inference; Doubly-robustness; Machine learning; Targeted learning.

PubMed Disclaimer

Conflict of interest statement

Declarations. Ethics approval: Not applicable. Competing interests: The authors have no relevant financial or non-financial interests to disclose.

Figures

Fig. 1
Fig. 1
Visual synthesis of the article. In A, the different steps of a causal inference framework. In B, estimators for causal effect that integrate Machine Learning methods, bridging the gap between statistical inference and Machine Learning

References

    1. Adlung L, Cohen Y, Mor U, Elinav E. Machine learning in clinical decision making. Med. 2021;2(6):642–65. - PubMed
    1. Kino S, Hsu YT, Shiba K, Chien YS, Mita C, Kawachi I, Daoud A. A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects. SSM-population Health. 2021;15:100836. - PMC - PubMed
    1. van Boven MR, Henke CE, Leemhuis AG, Hoogendoorn M, van Kaam AH, Königs M, Oosterlaan J. (2022). Machine learning prediction models for neurodevelopmental outcome after preterm birth: a scoping review and new machine learning evaluation framework. Pediatrics, 150(1), e2021056052. - PubMed
    1. Naimi AI, Cole SR, Kennedy EH. An introduction to g methods. Int J Epidemiol. 2017;46(2):756–62. - PMC - PubMed
    1. Kennedy EH. (2022). Semiparametric doubly robust targeted double machine learning: a review. arXiv preprint arXiv:2203.06469.

LinkOut - more resources