Biostatistics. 2024 Oct 1;25(4):933-946.
doi: 10.1093/biostatistics/kxae002.

Estimation of optimal treatment regimes with electronic medical record data using the residual life value estimator

Grace Rhodes et al. Biostatistics.

Erratum in

  • Correction.
    [No authors listed] Biostatistics. 2024 Dec 31;26(1):kxae029. doi: 10.1093/biostatistics/kxae029. PMID: 39186534. Free PMC article. No abstract available.

Abstract

Clinicians and patients must make treatment decisions at a series of key decision points throughout disease progression. A dynamic treatment regime is a set of sequential decision rules that return treatment decisions based on accumulating patient information, like that commonly found in electronic medical record (EMR) data. When applied to a patient population, an optimal treatment regime leads to the most favorable outcome on average. Identifying optimal treatment regimes that maximize residual life is especially desirable for patients with life-threatening diseases such as sepsis, a complex medical condition that involves severe infections with organ dysfunction. We introduce the residual life value estimator (ReLiVE), an estimator for the expected value of cumulative restricted residual life under a fixed treatment regime. Building on ReLiVE, we present a method for estimating an optimal treatment regime that maximizes expected cumulative restricted residual life. Our proposed method, ReLiVE-Q, conducts estimation via the backward induction algorithm Q-learning. We illustrate the utility of ReLiVE-Q in simulation studies, and we apply ReLiVE-Q to estimate an optimal treatment regime for septic patients in the intensive care unit using EMR data from the Multiparameter Intelligent Monitoring Intensive Care database. Ultimately, we demonstrate that ReLiVE-Q leverages accumulating patient information to estimate personalized treatment regimes that optimize a clinically meaningful function of residual life.

Keywords: MIMIC-III; Q-learning; context vector; dynamic treatment regime; electronic medical record; precision medicine; random forest; residual life; sepsis.
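
The backward induction behind Q-learning, as described in the abstract, can be illustrated with a minimal two-stage sketch. This is not the authors' ReLiVE-Q implementation: it assumes simulated data, a single scalar covariate per stage, binary treatments, no censoring, and linear Q-models fit by least squares in place of random forests. The stage rewards stand in loosely for contributions to cumulative restricted residual life, and every function and variable name here is hypothetical.

```python
import numpy as np


def fit_linear_q(state, action, target):
    """Least-squares fit of Q(s, a) = b0 + b1*s + a*(b2 + b3*s)."""
    design = np.column_stack(
        [np.ones_like(state), state, action, action * state]
    )
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def q_value(coef, state, action):
    """Evaluate the fitted Q-model at the given state(s) and action."""
    return coef[0] + coef[1] * state + action * (coef[2] + coef[3] * state)


def best_action(coef, state):
    """Treat (a = 1) when the estimated treatment contrast is positive."""
    return (coef[2] + coef[3] * state > 0).astype(int)


def two_stage_q_learning(s1, a1, r1, s2, a2, r2):
    """Backward induction: fit the last stage first, then the first stage."""
    # Stage 2: regress the final-stage reward on state and treatment.
    coef2 = fit_linear_q(s2, a2, r2)
    # Pseudo-outcome: stage-1 reward plus the estimated optimal stage-2 value.
    v2 = np.maximum(q_value(coef2, s2, 0), q_value(coef2, s2, 1))
    # Stage 1: regress the pseudo-outcome on stage-1 state and treatment.
    coef1 = fit_linear_q(s1, a1, r1 + v2)
    return coef1, coef2


# Simulated data (an assumption for illustration only; not EMR data).
rng = np.random.default_rng(0)
n = 5000
s1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n)            # randomized stage-1 treatment
r1 = 1.0 + a1 * s1 + 0.1 * rng.normal(size=n)   # treatment helps when s1 > 0
s2 = 0.5 * s1 + rng.normal(size=n)
a2 = rng.integers(0, 2, size=n)            # randomized stage-2 treatment
r2 = 1.0 - a2 * s2 + 0.1 * rng.normal(size=n)   # treatment helps when s2 < 0

coef1, coef2 = two_stage_q_learning(s1, a1, r1, s2, a2, r2)

# Estimated decision rules evaluated at two example covariate values.
example_states = np.array([-1.0, 1.0])
print("stage 1 rule:", best_action(coef1, example_states))
print("stage 2 rule:", best_action(coef2, example_states))
```

The design choice to show here is the pseudo-outcome: the stage-1 model is fit to the stage-1 reward plus the best achievable stage-2 value, so the stage-1 rule accounts for acting optimally afterwards. A faithful version would need to handle censoring and use the residual life value estimator, which this sketch does not attempt.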

Conflict of interest statement

None declared.

Figures

Fig. 1
AFT simulation study: boxplots of the value estimates from the validation procedure (left) and the testing procedure (right) for an optimal treatment regime (Opt), the observed treatment regime (Obs), and the no treatment regime (No). For scenarios dependent on the Q-models, value estimates are presented using the baseline vector (B), the average vector (A), the last-value carried forward vector (L), and the context vector (C).
Fig. 2
Cox simulation study: boxplots of the value estimates from the validation procedure (left) and the testing procedure (right) for an optimal treatment regime (Opt), the observed treatment regime (Obs), and the no treatment regime (No). For scenarios dependent on the Q-models, value estimates are presented using the baseline vector (B), the average vector (A), the last-value carried forward vector (L), and the context vector (C).
Fig. 3
MIMIC-III application: boxplots of the value estimates from the testing procedure for an optimal treatment regime (Opt), the observed treatment regime (Obs), and the no treatment regime (No). Value estimates are presented from Q-models using the baseline vector (B), the average vector (A), the last-value carried forward vector (L), and the context vector (C).
