Thirteen Questions About Using Machine Learning in Causal Research (You Won't Believe the Answer to Number 10!)

Stephen J Mooney, Alexander P Keil, Daniel J Westreich

PMID: 33751024
PMCID: PMC8555423
DOI: 10.1093/aje/kwab047

Am J Epidemiol. 2021 Aug 1;190(8):1476-1482.

Abstract

Machine learning is gaining prominence in the health sciences, where much of its use has focused on data-driven prediction. However, machine learning can also be embedded within causal analyses, potentially reducing biases arising from model misspecification. Using a question-and-answer format, we provide an introduction and orientation for epidemiologists interested in using machine learning but concerned about potential bias or loss of rigor due to use of "black box" models. We conclude with sample software code that may lower the barrier to entry to using these techniques.

Keywords: causal inference; double-robustness; epidemiologic methods; inverse probability weighting; machine learning; propensity score; targeted maximum likelihood estimation.

© The Author(s) 2021. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Comment in

Invited Commentary: Machine Learning in Causal Inference-How Do I Love Thee? Let Me Count the Ways.
Balzer LB, Petersen ML. Am J Epidemiol. 2021 Aug 1;190(8):1483-1487. doi: 10.1093/aje/kwab048. PMID: 33751059
