Am J Epidemiol. 2006 Jun 15;163(12):1149-56.
doi: 10.1093/aje/kwj149. Epub 2006 Apr 19.

Variable selection for propensity score models

M Alan Brookhart et al. Am J Epidemiol.

Abstract

Despite the growing popularity of propensity score (PS) methods in epidemiology, relatively little has been written in the epidemiologic literature about the problem of variable selection for PS models. The authors present the results of two simulation studies designed to help epidemiologists gain insight into the variable selection problem in a PS analysis. The simulation studies illustrate how the choice of variables that are included in a PS model can affect the bias, variance, and mean squared error of an estimated exposure effect. The results suggest that variables that are unrelated to the exposure but related to the outcome should always be included in a PS model. The inclusion of these variables will decrease the variance of an estimated exposure effect without increasing bias. In contrast, including variables that are related to the exposure but not to the outcome will increase the variance of the estimated exposure effect without decreasing bias. In very small studies, the inclusion of variables that are strongly related to the exposure but only weakly related to the outcome can be detrimental to an estimate in a mean squared error sense. The addition of these variables removes only a small amount of bias but can increase the variance of the estimated exposure effect. These simulation studies and other analytical results suggest that standard model-building tools designed to create good predictive models of the exposure will not always lead to optimal PS models, particularly in small studies.
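The pattern described in the abstract can be explored with a small Monte Carlo sketch. This is not the authors' simulation design, which the abstract does not specify. It assumes three independent standard normal covariates (a confounder, an outcome-only variable, and an exposure-only variable), a linear outcome model, illustrative coefficient values, and a normalized inverse-probability-weighted estimator. All function names and parameters are hypothetical.

```python
import numpy as np


def fit_propensity(covariates, exposure, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return fitted probabilities."""
    design = np.column_stack([np.ones(len(exposure)), covariates])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (exposure - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        beta += np.linalg.solve(hessian, gradient)
    return 1.0 / (1.0 + np.exp(-design @ beta))


def ipw_effect(outcome, exposure, ps):
    """Normalized inverse-probability-weighted difference in mean outcomes."""
    w1 = exposure / ps
    w0 = (1.0 - exposure) / (1.0 - ps)
    return np.sum(w1 * outcome) / np.sum(w1) - np.sum(w0 * outcome) / np.sum(w0)


def run_simulation(n=500, n_reps=300, true_effect=1.0, seed=0):
    """Compare PS models that differ only in which covariates they include."""
    rng = np.random.default_rng(seed)
    # Column 0: confounder, column 1: outcome-only, column 2: exposure-only.
    ps_models = {
        "confounder only": [0],
        "+ outcome-only": [0, 1],
        "+ exposure-only": [0, 2],
    }
    estimates = {name: [] for name in ps_models}
    for _ in range(n_reps):
        x = rng.standard_normal((n, 3))
        # Assumed data-generating model (illustrative coefficients).
        p_exposed = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] + 1.2 * x[:, 2])))
        a = rng.binomial(1, p_exposed).astype(float)
        y = true_effect * a + 0.5 * x[:, 0] + 1.0 * x[:, 1] + rng.standard_normal(n)
        for name, cols in ps_models.items():
            ps = fit_propensity(x[:, cols], a)
            estimates[name].append(ipw_effect(y, a, ps))
    summary = {}
    for name, values in estimates.items():
        values = np.asarray(values)
        bias = values.mean() - true_effect
        variance = values.var(ddof=1)
        summary[name] = {"bias": bias, "variance": variance, "mse": bias**2 + variance}
    return summary


if __name__ == "__main__":
    for name, stats in run_simulation().items():
        print(name, {key: round(float(value), 4) for key, value in stats.items()})
```

Under these assumptions, the three rows can be compared directly: the confounder is in every PS model, so differences between rows should reflect mainly the variance effect of the added covariate. Reducing `n` is one way to probe the small-study behavior the abstract mentions.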

Figures

Figure 1. The causal diagram for Simulation Experiment 1.
Figure 2. Variance of the unadjusted estimator $\hat{\gamma}_0$ and the PS-adjusted estimator $\hat{\gamma}_1$ for different values of $\beta_1$, for n = 500 and n = 2500.
Figure 3. Contours of the MSE of the PS-adjusted estimator relative to the unadjusted estimator, $\mathrm{MSE}(\hat{\gamma}_1)/\mathrm{MSE}(\hat{\gamma}_0)$.
