Robust Q-learning

Ashkan Ertefaie¹, James R McKay², David Oslin³, Robert L Strawderman¹

Affiliations

¹ Department of Biostatistics and Computational Biology, University of Rochester.
² Center on the Continuum of Care in the Addictions, Department of Psychiatry, University of Pennsylvania.
³ Philadelphia Veterans Administration Medical Center, and Treatment Research Center and Center for Studies of Addictions, Department of Psychiatry, University of Pennsylvania.

PMID: 34121784
PMCID: PMC8190585
DOI: 10.1080/01621459.2020.1753522

Ashkan Ertefaie et al. J Am Stat Assoc. 2021;116(533):368-381. doi: 10.1080/01621459.2020.1753522. Epub 2020 Jun 8.

Abstract

Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or efficiency loss. We propose a robust Q-learning approach which allows estimating such nuisance parameters using data-adaptive techniques. We study the asymptotic behavior of our estimators and provide simulation studies that highlight the need for and usefulness of the proposed method in practice. We use the data from the "Extending Treatment Effectiveness of Naltrexone" multi-stage randomized trial to illustrate our proposed methods.

Keywords: Cross-fitting; Data-adaptive techniques; Dynamic treatment strategies; Residual confounding.
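
As background for the abstract, the standard (non-robust) Q-learning procedure can be sketched as a backward sequence of regressions. This is a minimal illustration under stated assumptions, not the robust estimator proposed in the paper: it assumes two stages, treatments coded -1/+1, a single scalar covariate per stage, and linear working models. All function names and the simulated data-generating process are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def design(h, a):
    """Assumed working model: intercept, history, treatment, interaction."""
    return np.column_stack([np.ones_like(h), h, a, a * h])

def two_stage_q_learning(x1, a1, x2, a2, y):
    """Backward-induction Q-learning with linear working models.

    Treatments are coded -1/+1 and larger y is better. For brevity the
    stage-2 history is summarized by x2 alone (a simplifying assumption).
    """
    # Stage 2: regress the final outcome on stage-2 history and treatment.
    beta2 = fit_linear(design(x2, a2), y)
    # Pseudo-outcome: predicted outcome under the better stage-2 treatment.
    q2_plus = design(x2, np.ones_like(x2)) @ beta2
    q2_minus = design(x2, -np.ones_like(x2)) @ beta2
    pseudo = np.maximum(q2_plus, q2_minus)
    # Stage 1: regress the pseudo-outcome on stage-1 history and treatment.
    beta1 = fit_linear(design(x1, a1), pseudo)
    return beta1, beta2

def optimal_rule(beta, h):
    """Recommend +1 when the estimated treatment contrast is positive."""
    contrast = beta[2] + beta[3] * h
    return np.where(contrast > 0, 1, -1)

def simulate(n, seed):
    """Hypothetical two-stage randomized trial used only for illustration."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    a1 = rng.choice([-1.0, 1.0], size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    a2 = rng.choice([-1.0, 1.0], size=n)
    y = x2 + a2 * (0.5 - x2) + 0.3 * a1 * x1 + rng.normal(size=n)
    return x1, a1, x2, a2, y
```

When the linear working models in `design` are misspecified, the fitted contrasts can be biased, which is the concern the abstract raises about finite-dimensional working models.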

Figures

**Figure 1:**
ExTENd study design. The ® notation represents instances of randomization; the N values in this figure represent the subsequent number of patients assigned to each treatment option.

**Figure 2:**
ExTENd study. Covariate imbalance across different treatment groups. A₁: stage 1 treatment option; A_2NR: stage 2 treatment options among non-responders; A_2R: stage 2 treatment options among responders. The dashed vertical lines show cut points at ± 0.2.
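
The Figure 2 caption does not name the imbalance measure. Assuming it is the standardized mean difference (a common choice with a 0.2 threshold), a covariate check against the ± 0.2 cut points could look like the sketch below. The function names and the pooled-variance form are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def standardized_difference(x, group):
    """Standardized mean difference of x between group == 1 and group == 0.

    Uses the square root of the average of the two sample variances
    as the scale (an assumed, commonly used convention).
    """
    x = np.asarray(x, dtype=float)
    group = np.asarray(group)
    x1, x0 = x[group == 1], x[group == 0]
    pooled_sd = np.sqrt((x1.var(ddof=1) + x0.var(ddof=1)) / 2.0)
    return (x1.mean() - x0.mean()) / pooled_sd

def flag_imbalance(diffs, cutoff=0.2):
    """Return covariate names whose absolute difference exceeds the cutoff."""
    return [name for name, d in diffs.items() if abs(d) > cutoff]
```

For example, `flag_imbalance({"age": 0.05, "sex": -0.31})` would flag only `"sex"`, since its absolute value exceeds 0.2.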
