Health Serv Outcomes Res Methodol. 2021 Jun;21(2):206-228.
doi: 10.1007/s10742-020-00222-8. Epub 2020 Oct 20.

Veridical Causal Inference using Propensity Score Methods for Comparative Effectiveness Research with Medical Claims

Ryan D Ross et al. Health Serv Outcomes Res Methodol. 2021 Jun.

Abstract

Medical insurance claims are becoming increasingly common data sources for answering a variety of questions in biomedical research. Although these datasets are comprehensive in their longitudinal characterization of disease development and progression for a potentially large number of patients, population-based inference using them requires thoughtful modifications to sample selection and analytic strategies relative to other types of studies. Along with complex selection bias and missing data issues, claims-based studies are purely observational, which limits effective understanding and characterization of the treatment differences between the groups being compared. All these issues contribute to a crisis in reproducibility and replication of comparative findings using medical claims. This paper offers practical guidance on the analytical process; demonstrates methods for estimating causal treatment effects with propensity score methods for several types of outcomes common to such studies, such as binary, count, time-to-event, and longitudinally varying measures; and aims to increase the transparency and reproducibility of reporting of results from these investigations. We provide an online version of the paper with readily implementable code for the entire analysis pipeline to serve as a guided tutorial for practitioners. The online version can be accessed at https://rydaro.github.io/. The analytic pipeline is illustrated using a sub-cohort of patients with advanced prostate cancer from the large Clinformatics™ Data Mart Database (OptumInsight, Eden Prairie, Minnesota), consisting of 73 million distinct private payer insurees from 2001 to 2016.
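To make the idea of estimating an average treatment effect (ATE) with propensity scores concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a hypothetical toy cohort with one binary confounder `x`, a binary treatment `t`, and a binary outcome `y`. The propensity score is estimated as the treated proportion within each stratum of `x`, a stand-in for the logistic regression and CBPS models mentioned in the paper, and the ATE is estimated by normalized inverse probability weighting. All function names and counts are illustrative assumptions.

```python
from collections import defaultdict


def make_toy_cohort():
    """Build hypothetical patient records (x, t, y); counts are invented."""
    # Each cell: (confounder x, treatment t, patients, patients with y = 1)
    cells = [(0, 1, 10, 2), (0, 0, 30, 3), (1, 1, 30, 15), (1, 0, 10, 4)]
    rows = []
    for x, t, n, events in cells:
        rows += [(x, t, 1)] * events + [(x, t, 0)] * (n - events)
    return rows


def propensity_by_stratum(rows):
    """Estimate P(t = 1 | x) as the treated proportion in each stratum of x."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for x, t, _ in rows:
        total[x] += 1
        treated[x] += t
    return {x: treated[x] / total[x] for x in total}


def naive_difference(rows):
    """Unadjusted difference in outcome means (treated minus control)."""
    y1 = [y for _, t, y in rows if t == 1]
    y0 = [y for _, t, y in rows if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def ipw_ate(rows):
    """Normalized inverse-probability-weighted estimate of the ATE.

    Assumes every stratum has both treated and control patients, so the
    estimated propensity is strictly between 0 and 1.
    """
    ps = propensity_by_stratum(rows)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / ps[x]          # weight for treated patients
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - ps[x])  # weight for control patients
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    cohort = make_toy_cohort()
    print("naive difference:", round(naive_difference(cohort), 3))
    print("IPW ATE estimate:", round(ipw_ate(cohort), 3))
```

In this toy cohort the treatment effect is 0.10 within each stratum, but treated patients are concentrated in the higher-risk stratum, so the naive difference is 0.25. Weighting by the inverse of the estimated propensity recovers 0.10.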

Keywords: average treatment effect; covariate adjustment; hormone therapy; insurance claims; matching; prostate cancer; reproducibility; sensitivity analysis; veridical data science.

Conflict of interest statement

Conflicts of Interest: The authors have no competing interests and nothing to disclose.

Figures

Figure 1: Comparative Effectiveness Data Analysis Pipeline Flow Diagram. The gold pathway indicates steps repeated until acceptable balance is achieved.
Figure 2: Balance Diagnostics Plot. Standardized differences are shown for each confounder variable. Vertical dotted lines indicate the desired balance level. Differences are shown for the observed data and after matching and weighting with both calculated propensity scores (logistic regression and CBPS).
Figure 3: Visualized Sensitivity Analysis. Four sensitivity analyses for four ATE estimates are shown. Contours over the scatterplot show the entire distribution of ATE estimates and associated p-values for the set of plausible propensity score models. Dashed lines show the denoted percentile cutoffs for this distribution. K denotes the number of covariates in the shown model, and the dotted line plot shows the median ATE and p-value for each set of K covariates from K=1 to K=12. The thick solid line indicates the significance threshold of alpha=0.05. E-values for the full model (K=12) are listed in the caption.
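The figure captions above mention two diagnostics, standardized differences and E-values, that can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code. The standardized difference uses a simple weighted mean and a simple weighted variance (dividing by the sum of weights), and the E-value uses the standard risk-ratio formula, RR + sqrt(RR × (RR − 1)), although the scale of the paper's estimates is not stated on this page. The function names and example data are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_var(values, weights):
    """Simple weighted variance (divides by the sum of weights)."""
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def standardized_difference(x, treated, weights=None):
    """Standardized difference of covariate x between treatment groups.

    With weights=None every patient gets weight 1 (the observed data);
    passing propensity-score weights gives the weighted diagnostic.
    """
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, t in zip(x, treated) if t == 1]
    w1 = [w for w, t in zip(weights, treated) if t == 1]
    x0 = [v for v, t in zip(x, treated) if t == 0]
    w0 = [w for w, t in zip(weights, treated) if t == 0]
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2.0)
    if pooled_sd == 0.0:
        raise ValueError("covariate has no variation in either group")
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


def e_value(rr):
    """E-value for a point estimate on the risk-ratio scale."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective effects are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # Hypothetical covariate values and treatment indicators.
    age = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    trt = [0, 0, 0, 1, 1, 1]
    print("standardized difference:", round(standardized_difference(age, trt), 3))
    print("E-value for RR = 2:", round(e_value(2.0), 3))
```

An absolute standardized difference below 0.1 is a commonly used balance threshold; the caption does not state which value the dotted lines in Figure 2 represent.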
