Biostatistics. 2017 Jul 1;18(3):553-568.
doi: 10.1093/biostatistics/kxx003.

Guided Bayesian imputation to adjust for confounding when combining heterogeneous data sources in comparative effectiveness research

Joseph Antonelli et al. Biostatistics.

Abstract

In comparative effectiveness research, we are often interested in estimating an average causal effect (ACE) from large observational data (the main study). Often these data do not measure all the necessary confounders. In many cases, an extensive set of additional covariates is measured for a smaller, non-representative population (the validation study). In this setting, standard approaches for missing data imputation might not be adequate because the number of missing covariates in the main data is large relative to the smaller sample size of the validation data. We propose a Bayesian approach to estimate the ACE in the main study that borrows information from the validation study to improve confounding adjustment. Our approach combines ideas from Bayesian model averaging, confounder selection, and missing data imputation into a single framework. It allows for different treatment effects in the main study and in the validation study, and it propagates the uncertainty due to missing data imputation and confounder selection when estimating the ACE in the main study. We compare our method to several existing approaches via simulation. We apply our method to a study examining the effect of surgical resection on survival among 10 396 Medicare beneficiaries with a brain tumor, where additional covariate information is available on 2220 patients in SEER-Medicare. We find that the estimated ACE decreases by 30% when incorporating additional information from SEER-Medicare.

Keywords: Bayesian adjustment for confounding; Bayesian data augmentation; Confounder selection; Missing data; Model averaging.

Figures

Fig. 1.
Bias, MSE, and interval coverage of the various estimators across 1000 simulations.
Fig. 2.
Estimated [formula] for each of the 50 covariates that can potentially enter into the outcome model, for GBAC([formula]) and GBAC(1). The points in black correspond to GBAC([formula]), while those in grey correspond to GBAC(1). Squares represent the true confounders ([formula]), while circles represent covariates that are noise. Points to the left of the dotted line (indices 1–5) are covariates that are fully observed, while those to the right (indices 6–50) are observed only in the validation study.
Fig. 3.
Estimates and 95% posterior credible intervals for the average causal effect of surgical resection on the probability of 30-day survival in the Medicare population.
