Multivariate Behav Res. 2021 Nov-Dec;56(6):829-852.
doi: 10.1080/00273171.2020.1808437. Epub 2020 Aug 28.

Random Forests Approach for Causal Inference with Clustered Observational Data

Youmi Suk et al. Multivariate Behav Res. 2021 Nov-Dec.

Abstract

There is growing interest in using machine learning (ML) methods for causal inference because they can model key quantities, such as the propensity score or the outcome model, flexibly and (nearly) automatically. Unfortunately, most ML methods for causal inference have been studied in single-level settings where all individuals are independent of each other, and there is little work on using these methods with clustered or nested data, a common setting in education studies. This paper investigates one particular ML method based on random forests, known as Causal Forests, for estimating treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. The simulations show that when the ML method is supplemented with estimated propensity scores from multilevel models that account for the clustered/hierarchical structure, the modified method outperforms preexisting methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study (TIMSS) data, a large-scale educational assessment in which students are nested within schools.

Keywords: Causal inference; hierarchical linear modeling; machine learning methods; multilevel observational data; multilevel propensity score matching.
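
The abstract's central point, that propensity scores should reflect the clustered structure, can be illustrated with a small simulation. The sketch below is not the paper's method: it uses neither Causal Forests nor a fitted multilevel model. As a stand-in, it shrinks each cluster's treated share toward the overall share (a crude form of partial pooling) and plugs the result into a normalized inverse-probability-weighted estimator. All function names, the simulated data-generating process, and the `prior_strength` parameter are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def simulate_two_level(n_clusters=200, cluster_size=40, true_effect=2.0, seed=0):
    """Simulate units nested in clusters (hypothetical data-generating process).

    A cluster effect u raises both treatment uptake and the outcome,
    so it confounds any comparison that ignores clustering.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        u = rng.gauss(0.0, 1.0)
        p_treat = 1.0 / (1.0 + math.exp(-u))
        for _ in range(cluster_size):
            z = 1 if rng.random() < p_treat else 0
            y = true_effect * z + 1.5 * u + rng.gauss(0.0, 1.0)
            rows.append((j, z, y))
    return rows


def pooled_propensity(rows):
    """Single-level propensity: the same overall treated share for every unit."""
    share = sum(z for _, z, _ in rows) / len(rows)
    return {j: share for j, _, _ in rows}


def cluster_propensity(rows, prior_strength=1.0):
    """Cluster-aware propensity: each cluster's treated share, shrunk toward
    the overall share. This is a stand-in for a multilevel model, not one."""
    overall = sum(z for _, z, _ in rows) / len(rows)
    treated = defaultdict(int)
    size = defaultdict(int)
    for j, z, _ in rows:
        treated[j] += z
        size[j] += 1
    return {
        j: (treated[j] + prior_strength * overall) / (size[j] + prior_strength)
        for j in size
    }


def ipw_effect(rows, propensity):
    """Normalized (Hajek) inverse-probability-weighted difference in means."""
    num1 = den1 = num0 = den0 = 0.0
    for j, z, y in rows:
        e = propensity[j]
        if z == 1:
            num1 += y / e
            den1 += 1.0 / e
        else:
            num0 += y / (1.0 - e)
            den0 += 1.0 / (1.0 - e)
    return num1 / den1 - num0 / den0


rows = simulate_two_level()
naive_estimate = ipw_effect(rows, pooled_propensity(rows))
cluster_estimate = ipw_effect(rows, cluster_propensity(rows))
print(f"true effect 2.0, pooled {naive_estimate:.2f}, cluster-aware {cluster_estimate:.2f}")
```

In this simulated setting the pooled propensity score reduces the estimator to a raw difference in means, which absorbs the cluster-level confounding, while the cluster-aware score lands much closer to the simulated effect. The example only shows why the clustering matters for the propensity score; it makes no claim about the performance of the estimators studied in the paper.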
