J Am Stat Assoc. 2021;116(533):368-381.
doi: 10.1080/01621459.2020.1753522. Epub 2020 Jun 8.

Robust Q-learning

Ashkan Ertefaie et al. J Am Stat Assoc. 2021.

Abstract

Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or efficiency loss. We propose a robust Q-learning approach which allows estimating such nuisance parameters using data-adaptive techniques. We study the asymptotic behavior of our estimators and provide simulation studies that highlight the need for and usefulness of the proposed method in practice. We use the data from the "Extending Treatment Effectiveness of Naltrexone" multi-stage randomized trial to illustrate our proposed methods.

Keywords: Cross-fitting; Data-adaptive techniques; Dynamic treatment strategies; Residual confounding.
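
For orientation, the regression-based idea behind standard Q-learning can be sketched for a two-stage setting. This is a minimal illustration with linear working models fit by ordinary least squares; it is not the robust estimator proposed in the paper, which replaces such parametric nuisance models with data-adaptive fits. The simulated data, the -1/+1 treatment coding, and all function and variable names are assumptions made for the example.

```python
import numpy as np


def fit_ols(design, response):
    """Least-squares coefficients for a linear working model."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def two_stage_q_learning(x1, a1, x2, a2, y):
    """Backward-induction Q-learning with linear working models.

    Treatments are assumed to be coded -1/+1, so maximizing the
    stage 2 Q-function over a2 gives main effects + |contrast|.
    """
    ones = np.ones_like(y)

    # Stage 2: regress the final outcome on history and stage 2 treatment.
    d2 = np.column_stack([ones, x1, a1, x2, a2, a2 * x2])
    beta2 = fit_ols(d2, y)
    main2 = d2[:, :4] @ beta2[:4]          # part not involving a2
    contrast2 = beta2[4] + beta2[5] * x2   # coefficient multiplying a2

    # Pseudo-outcome: predicted outcome under the best stage 2 treatment.
    pseudo = main2 + np.abs(contrast2)

    # Stage 1: regress the pseudo-outcome on baseline history and a1.
    d1 = np.column_stack([ones, x1, a1, a1 * x1])
    beta1 = fit_ols(d1, pseudo)
    return beta1, beta2


def recommend_stage2(beta2, x2):
    """Estimated stage 2 rule: pick the sign of the fitted contrast."""
    return np.where(beta2[4] + beta2[5] * x2 >= 0, 1.0, -1.0)


def simulate(n, seed=0):
    """Hypothetical two-stage randomized trial (illustration only)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    a1 = rng.choice([-1.0, 1.0], size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    a2 = rng.choice([-1.0, 1.0], size=n)
    y = x1 + x2 + 0.3 * a1 + a2 * (0.5 + 1.0 * x2) + rng.normal(size=n)
    return x1, a1, x2, a2, y


x1, a1, x2, a2, y = simulate(20000)
beta1, beta2 = two_stage_q_learning(x1, a1, x2, a2, y)
print("stage 2 contrast coefficients:", beta2[4:])
print("stage 1 treatment coefficients:", beta1[2:])
```

In this sketch the stage 2 working model happens to match the simulated outcome model. When such a model is misspecified, the main-effect terms can leave the residual confounding and efficiency loss that the abstract describes, which is the motivation for estimating those nuisance components data-adaptively.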

Figures

Figure 1: ExTENd study design. The ® notation represents instances of randomization; the N values in this figure represent the subsequent number of patients assigned to each treatment option.
Figure 2: ExTENd study. Covariate imbalance across different treatment groups. A1: stage 1 treatment option; A2NR: stage 2 treatment options among non-responders; A2R: stage 2 treatment options among responders. The dashed vertical lines show cut points at ±0.2.
