Health Serv Res. 2021 Oct;56(5):932-941.
doi: 10.1111/1475-6773.13666. Epub 2021 May 12.

Confounding and regression adjustment in difference-in-differences studies

Bret Zeldow et al. Health Serv Res. 2021 Oct.

Abstract

Objective: To define confounding bias in difference-in-differences studies and to compare regression- and matching-based estimators designed to correct bias due to observed confounders.

Data sources: We simulated data from linear models that incorporated different confounding relationships: time-invariant covariates with a time-varying effect on the outcome, time-varying covariates with a constant effect on the outcome, and time-varying covariates with a time-varying effect on the outcome. We considered a simple setting that is common in the applied literature: treatment is introduced at a single time point and there is no unobserved treatment effect heterogeneity.
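As a rough illustration of this kind of data-generating process, the sketch below simulates a two-period panel with a binary, time-invariant covariate whose effect on the outcome may change over time. It is not the authors' replication code: the function names, the outcome equation, the coefficient values, and the covariate probabilities are all assumptions chosen for illustration; only the default of 800 units is taken from the figure legends.

```python
import random
from statistics import mean

def simulate_two_period_panel(n_units=800, effect=1.0, beta_pre=1.0, beta_post=1.0,
                              p_x_treated=0.7, p_x_control=0.3, noise_sd=1.0, seed=0):
    """Simulate a two-period panel; treatment starts in the second period.

    Assumed (hypothetical) outcome model:
        y = 1 + 0.5*post + beta_t*x + effect*treated*post + noise
    where x is binary and time-invariant, and beta_t is beta_pre or beta_post.
    """
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        treated = 1 if unit < n_units // 2 else 0          # first half is treated
        p_x = p_x_treated if treated else p_x_control      # x is imbalanced by group
        x = 1 if rng.random() < p_x else 0
        for post, beta in ((0, beta_pre), (1, beta_post)):
            noise = rng.gauss(0.0, noise_sd)
            y = 1.0 + 0.5 * post + beta * x + effect * treated * post + noise
            rows.append({"unit": unit, "treated": treated, "post": post, "x": x, "y": y})
    return rows

def unadjusted_did(rows):
    """Difference-in-differences of the four group-by-period outcome means."""
    def cell_mean(treated, post):
        return mean(r["y"] for r in rows if r["treated"] == treated and r["post"] == post)
    return (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Constant covariate effect: x cancels out of the before/after differences.
constant = simulate_two_period_panel(beta_pre=1.0, beta_post=1.0, noise_sd=0.0)
# Time-varying covariate effect: the group imbalance in x leaks into the estimate.
varying = simulate_two_period_panel(beta_pre=1.0, beta_post=2.0, noise_sd=0.0)
print(unadjusted_did(constant), unadjusted_did(varying))
```

With the noise switched off, the first estimate equals the assumed effect of 1.0, while the second is shifted by the change in the covariate's coefficient times the difference in covariate means between the groups.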

Study design: We compared the bias and root mean squared error of treatment effect estimates from six model specifications, including simple linear regression models and matching techniques.
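The two performance measures can be computed from a collection of simulated effect estimates roughly as follows. The abstract does not state the exact formulas, so this sketch assumes the conventional definitions of percent bias and root mean squared error; the function names and the example estimates are hypothetical.

```python
from math import sqrt
from statistics import mean

def percent_bias(estimates, true_effect):
    """Mean estimation error expressed as a percentage of the true effect."""
    return 100.0 * (mean(estimates) - true_effect) / true_effect

def rmse(estimates, true_effect):
    """Root mean squared error of the estimates around the true effect."""
    return sqrt(mean((e - true_effect) ** 2 for e in estimates))

# Made-up estimates from four simulated datasets with a true effect of 2.0.
estimates = [1.5, 2.5, 2.0, 2.0]
print(percent_bias(estimates, 2.0), rmse(estimates, 2.0))
```

An estimator can have zero bias and still a large RMSE, since RMSE also reflects the spread of the estimates across simulated datasets.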

Data collection: Simulation code is provided for replication.

Principal findings: Confounders in difference-in-differences are covariates that change differently over time in the treated and comparison groups or that have a time-varying effect on the outcome. When such a confounder is measured, appropriately adjusting for it (ie, including it in a regression model that is consistent with the causal model) can provide unbiased estimates with optimal SE. However, when a time-varying confounder is affected by treatment, recovering an unbiased causal effect using difference-in-differences is difficult.
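The definition of a confounder given here can be made concrete with a small calculation. The sketch below assumes a two-period linear model in which the mean untreated outcome in each group is a period effect plus a period-specific coefficient times the group's covariate mean (any fixed group intercept cancels and is omitted). Under that assumption, the bias of the unadjusted difference-in-differences is the difference between the groups in how that covariate term changes over time. The function name and all numbers are hypothetical.

```python
def unadjusted_did_bias(beta_pre, beta_post, x_treated, x_control):
    """Bias of the unadjusted DiD when E[Y0 | group, t] = alpha_t + beta_t * E[X_t | group].

    x_treated and x_control are (pre, post) pairs of group-level covariate means.
    """
    treated_change = beta_post * x_treated[1] - beta_pre * x_treated[0]
    control_change = beta_post * x_control[1] - beta_pre * x_control[0]
    return treated_change - control_change

scenarios = {
    # Covariate differs by group but neither it nor its effect changes: no bias.
    "time-invariant x, constant effect": unadjusted_did_bias(2.0, 2.0, (1.0, 1.0), (0.0, 0.0)),
    # Covariate differs by group and its effect changes over time: bias.
    "time-invariant x, time-varying effect": unadjusted_did_bias(2.0, 3.0, (1.0, 1.0), (0.0, 0.0)),
    # Effect changes over time but the groups have the same covariate mean: no bias.
    "balanced x, time-varying effect": unadjusted_did_bias(2.0, 3.0, (1.0, 1.0), (1.0, 1.0)),
    # Covariate evolves identically in both groups, constant effect: no bias.
    "x evolves in parallel, constant effect": unadjusted_did_bias(2.0, 2.0, (1.0, 2.0), (0.0, 1.0)),
    # Covariate evolves differently across groups, constant effect: bias.
    "x evolves differently, constant effect": unadjusted_did_bias(2.0, 2.0, (1.0, 2.0), (0.0, 0.5)),
}
for name, bias in scenarios.items():
    print(f"{name}: {bias:+.2f}")
```

A covariate that merely differs in level between the groups is harmless in this setup; bias appears only when its contribution to the outcome changes over time differently in the two groups.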

Conclusions: Confounding in difference-in-differences is more complicated than in cross-sectional settings, and the techniques and intuition used to address observed confounding in those settings cannot be imported wholesale. Instead, analysts should begin by postulating a causal model that relates covariates, both time-varying and those with time-varying effects on the outcome, to treatment. This causal model will then guide the specification of an appropriate analytical model (eg, using regression or matching) that can produce unbiased treatment effect estimates. We emphasize the importance of thoughtful incorporation of covariates to address confounding bias in difference-in-differences studies.

Keywords: difference-in-differences; matching; parallel trends; regression adjustment; time-varying confounding.

Figures

FIGURE 1
Adjusting for the main effect of a covariate does not correct for diverging trends, but adjusting for its interaction with time does. Legend: In this simulated example, untreated potential outcomes depend on a time‐invariant covariate with a time‐varying effect. Panel A shows mean untreated potential outcomes by group. Panels B to D show residuals from linear models, denoted using pseudo‐code for the function lm, which fits a linear model for outcome y. In panel B, the only predictor is time. In panel C, the predictors are time and the covariate x. In panel D, the predictors are time, the covariate, and their interaction.
FIGURE 2
Simulation results for a time‐invariant covariate. Legend: Six regression and matching methods were compared across three simulation scenarios. Each panel shows results from 400 simulated datasets of 800 units each. In Scenario 1, the distribution of the covariate varied by treatment group but the covariate's effect on the outcome did not change (ie, no interaction between the covariate and time). In Scenario 2, the covariate's effect on the outcome changed over time. In the third scenario, the distribution of the covariate was the same in the treated and comparison groups, and the covariate's effect on the outcome changed over time. All analyses were assessed on the mean percent bias and mean standard error (SE) of the effect estimate. CA = Covariate‐adjusted; TVA = Time‐varying adjusted
FIGURE 3
Simulation results for a time‐varying covariate with a time‐invariant effect on the outcome. Legend: Six regression and matching methods were compared across three simulation scenarios. Each panel shows results from 400 simulated datasets of 800 units each. For all scenarios, the covariate's effect on the outcome was constant over time. In Scenario 4a, the time‐varying covariate evolved in the same way for the treated and comparison groups. In Scenario 5a, the covariate evolved differently between the two groups starting from the first timepoint (before treatment was implemented). In Scenario 6a, the covariate evolved in the same way in both groups prior to treatment; once treatment was implemented, its evolution diverged between the two groups. All analyses were assessed on the mean percent bias and mean standard error (SE) of the effect estimate. CA = Covariate‐adjusted; TVA = Time‐varying adjusted
FIGURE 4
Simulation results for a time‐varying covariate with a time‐varying effect on the outcome. Legend: Six regression and matching methods were compared across three simulation scenarios. Each panel shows results from 400 simulated datasets of 800 units each. For all scenarios, the covariate's effect on the outcome differed across time. In Scenario 4b, the time‐varying covariate evolved in the same way for the treated and comparison groups. In Scenario 5b, the covariate evolved differently between the two groups starting from the first timepoint (before treatment was implemented). In Scenario 6b, the covariate evolved in the same way in both groups prior to treatment; once treatment was implemented, its evolution diverged between the two groups. All analyses were assessed on the mean percent bias and mean standard error (SE) of the effect estimate. CA = Covariate‐adjusted; TVA = Time‐varying adjusted
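The comparison described in the Figure 1 legend can be reproduced in miniature. The legend refers to the R function lm; the sketch below swaps in a small hand-written least-squares routine in Python so that it runs with the standard library alone. The eight units, the covariate values, and the outcome equation are invented for illustration and contain no noise. For each model, the code reports how much the treated-minus-comparison gap in mean residuals changes from the first period to the second; a value of zero means the residual trends are parallel.

```python
from statistics import mean

def ols_fit(rows, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                factor = a[i][col] / a[col][col]
                a[i] = [v - factor * p for v, p in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical data: x is more common in the treated group (group = 1).
units = [(1, x) for x in (1, 1, 1, 0)] + [(0, x) for x in (1, 0, 0, 0)]
# Untreated outcome: the effect of x is 1 in period 0 and 2 in period 1.
records = [(g, x, t, 1.0 + 0.5 * t + (1.0 + t) * x) for g, x in units for t in (0, 1)]

def residual_gap_change(design):
    """Change over time in the treated-minus-comparison gap in mean residuals."""
    rows = [design(x, t) for _, x, t, _ in records]
    y = [rec[3] for rec in records]
    coef = ols_fit(rows, y)
    resid = [yi - sum(c * v for c, v in zip(coef, row)) for yi, row in zip(y, rows)]

    def gap(t):
        treated = [e for e, rec in zip(resid, records) if rec[0] == 1 and rec[2] == t]
        control = [e for e, rec in zip(resid, records) if rec[0] == 0 and rec[2] == t]
        return mean(treated) - mean(control)

    return gap(1) - gap(0)

panel_b = residual_gap_change(lambda x, t: [1.0, t])            # y ~ time
panel_c = residual_gap_change(lambda x, t: [1.0, t, x])         # y ~ time + x
panel_d = residual_gap_change(lambda x, t: [1.0, t, x, t * x])  # y ~ time + x + time:x
print(panel_b, panel_c, panel_d)
```

In this toy dataset the gap widens by the same amount whether or not the main effect of x is included, and it closes only once the interaction between x and time enters the model, which is the pattern the legend describes for panels B to D.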
