Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial

Bibhas Chakraborty et al. Biometrics. 2016 Sep;72(3):865-76. doi: 10.1111/biom.12493. Epub 2016 Feb 17.

Abstract

A dynamic treatment regimen consists of decision rules that recommend how to individualize treatment for patients based on their available treatment and covariate history. In many scientific domains, these decision rules are shared across stages of intervention. As an illustrative example, we discuss STAR*D, a multistage randomized clinical trial for treating major depression. Estimating such shared decision rules often amounts to estimating parameters, indexing the decision rules, that are shared across stages. In this article, we propose a novel simultaneous estimation procedure for the shared parameters based on Q-learning. We provide an extensive simulation study illustrating the merit of the proposed method over simple competitors in terms of how closely the procedure's treatment allocations match those of the "oracle" procedure, defined as the one that makes treatment recommendations based on the true parameter values rather than their estimates. We also consider bias and mean squared error of the individual parameter estimates as secondary metrics. Finally, we analyze the STAR*D data using the proposed method.
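As a rough illustration of the idea behind simultaneous estimation of shared parameters, the Python sketch below fits linear Q-functions to a simulated two-stage trial in which the same vector ψ indexes the decision rule at both stages. Everything here is an assumption for illustration: the linear working models, the stacked least-squares update iterated from an initial value of ψ, and all variable and function names are hypothetical, and the sketch is not the paper's algorithm or the STAR*D analysis.

```python
import numpy as np

def q_shared(X1, A1, R1, X2, A2, R2, psi_init, max_iter=200, tol=1e-8):
    """Hypothetical sketch of Q-learning with a shared decision-rule
    parameter psi across two stages (treatments coded -1/+1).

    Working models:
        Q2(x2, a2) = x2'beta2 + (x2'psi) * a2
        Q1(x1, a1) = x1'beta1 + (x1'psi) * a1
    The stage-1 pseudo-outcome plugs in the best stage-2 treatment,
    R1 + max_a2 Q2 = R1 + x2'beta2 + |x2'psi|, so it depends on the very
    parameters being estimated; we therefore iterate a stacked
    least-squares fit from an initial psi until psi stabilizes.
    """
    n, p = X1.shape
    zeros = np.zeros((n, p))
    psi = np.asarray(psi_init, dtype=float)
    beta2 = np.zeros(p)
    for _ in range(max_iter):
        # Stage-1 pseudo-outcome under the current parameter values.
        y1_tilde = R1 + X2 @ beta2 + np.abs(X2 @ psi)
        # Stack both stages; the psi block is common to both sets of rows,
        # so one least-squares solve estimates psi from all the data.
        design = np.vstack([
            np.hstack([X1, zeros, X1 * A1[:, None]]),   # stage-1 rows
            np.hstack([zeros, X2, X2 * A2[:, None]]),   # stage-2 rows
        ])
        response = np.concatenate([y1_tilde, R2])
        theta, *_ = np.linalg.lstsq(design, response, rcond=None)
        beta1, new_beta2, new_psi = theta[:p], theta[p:2 * p], theta[2 * p:]
        done = np.max(np.abs(new_psi - psi)) < tol
        beta2, psi = new_beta2, new_psi
        if done:
            break
    return beta1, beta2, psi

# Simulated two-stage data in which the same psi drives both stages.
rng = np.random.default_rng(0)
n, p = 2000, 2
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
A1, A2 = rng.choice([-1.0, 1.0], size=(2, n))
psi_true = np.array([0.3, -0.5])
R1 = X1 @ np.array([1.0, 0.5]) + (X1 @ psi_true) * A1 + rng.normal(size=n)
R2 = X2 @ np.array([0.5, 1.0]) + (X2 @ psi_true) * A2 + rng.normal(size=n)

beta1, beta2, psi_hat = q_shared(X1, A1, R1, X2, A2, R2, psi_init=np.zeros(p))
# Estimated shared rule: at stage t, recommend a_t = sign(x_t @ psi_hat).
```

Rerunning the iteration from several initial values of ψ, as with the five starting points in Figure 2, is a natural check that the fixed point reached does not depend on where the algorithm starts.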

Keywords: Dynamic treatment regimens; Q-learning; STAR*D; Shared parameters.


Figures

Figure 1. A schematic of the treatment assignment algorithm in the STAR*D study. An “R” within a circle denotes randomization.

Figure 2. Convergence patterns of ψ0, ψ1, ψ2, ψ3, and ψ4 for the five versions of the Q-shared method (corresponding to five initial values) in the STAR*D study.

Figure 3. Confidence planes for the contrast functions and the resulting regions of varying recommended optimal treatments in the (QIDS.start, QIDS.slope) plane, for subjects who experienced low side-effect intensity (0) and were treated with combination therapy (−1) at the previous stage, based on the Q-shared method for estimation and the m-out-of-n bootstrap for inference.

