Sci Rep. 2024 Feb 24;14(1):4516.
doi: 10.1038/s41598-024-55110-9.

Optimizing warfarin dosing for patients with atrial fibrillation using machine learning

Jeremy Petch et al. Sci Rep.

Abstract

While novel oral anticoagulants are increasingly used to reduce the risk of stroke in patients with atrial fibrillation, vitamin K antagonists such as warfarin continue to be used extensively for stroke prevention across the world. Although warfarin is effective in reducing the risk of stroke, its complex pharmacodynamics make it difficult to use clinically, with many patients experiencing under- and/or over-anticoagulation. In this study we employed a novel implementation of deep reinforcement learning to provide clinical decision support to optimize time in therapeutic International Normalized Ratio (INR) range. We used a novel semi-Markov decision process formulation of the Batch-Constrained deep Q-learning algorithm to develop a reinforcement learning model that dynamically recommends warfarin dosing to achieve an INR of 2.0–3.0 for patients with atrial fibrillation. The model was developed using data from 22,502 patients in the warfarin-treated groups of the pivotal randomized clinical trials of edoxaban (ENGAGE AF-TIMI 48), apixaban (ARISTOTLE) and rivaroxaban (ROCKET AF). The model was externally validated on data from 5,730 warfarin-treated patients in a fourth trial of dabigatran (RE-LY) using multilevel regression models to estimate the relationship between center-level algorithm-consistent dosing, time in therapeutic INR range (TTR), and a composite clinical outcome of stroke, systemic embolism or major hemorrhage. External validation showed a positive association between center-level algorithm-consistent dosing and TTR (R² = 0.56). Each 10% increase in algorithm-consistent dosing at the center level independently predicted a 6.78% improvement in TTR (95% CI 6.29, 7.28; p < 0.001) and an 11% decrease in the composite clinical outcome (HR 0.89; 95% CI 0.81, 1.00; p = 0.015).
These results were comparable to those of a rules-based clinical algorithm used for benchmarking, for which each 10% increase in algorithm-consistent dosing independently predicted a 6.10% increase in TTR (95% CI 5.67, 6.54, p < 0.001) and a 10% decrease in the composite outcome (HR 0.90; 95% CI 0.83, 0.98, p = 0.018). Our findings suggest that a deep reinforcement learning algorithm can optimize time in therapeutic range for patients taking warfarin. A digital clinical decision support system to promote algorithm-consistent warfarin dosing could optimize time in therapeutic range and improve clinical outcomes in atrial fibrillation globally.
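The TTR values reported above depend on assigning an INR to days without a test. The Figure 1 caption names linear interpolation (Rosendaal's method) for this. The following is a minimal sketch of that calculation, not the authors' code: the function name, the integer day indexing, and the convention of counting each day after the first test up to and including the next test are assumptions made for illustration.

```python
def time_in_therapeutic_range(measurements, low=2.0, high=3.0):
    """Estimate TTR as the fraction of days with interpolated INR in [low, high].

    measurements: list of (day, inr) tuples sorted by day, with integer days.
    Assumption: every day after one test, up to and including the next test,
    receives a linearly interpolated INR value.
    """
    days_in_range = 0
    days_total = 0
    for (day0, inr0), (day1, inr1) in zip(measurements, measurements[1:]):
        span = day1 - day0
        for k in range(1, span + 1):
            # Linear interpolation between the two observed INR values.
            inr = inr0 + (inr1 - inr0) * k / span
            days_total += 1
            if low <= inr <= high:
                days_in_range += 1
    return days_in_range / days_total if days_total else float("nan")


# Hypothetical patient: INR tests on days 0, 10 and 20.
example = [(0, 1.5), (10, 2.5), (20, 3.5)]
ttr = time_in_therapeutic_range(example)
```

In this hypothetical trajectory the interpolated INR is inside 2.0–3.0 on 11 of the 20 interpolated days, so the estimated TTR is 0.55.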

Conflict of interest statement

Dr. Petch reports research support from Roche Canada. Mr. Nelson has nothing to disclose. Ms. Wu has nothing to disclose. Ms. Di has nothing to disclose. Dr. Carnicelli is supported by grant funding from the National Institutes of Health (5T32HL069749-17). Dr. Ghassemi has nothing to disclose. Dr. Benz has nothing to disclose. Dr. Fatemi is an employee of Microsoft Research. Dr. Granger reports research grants/contracts from AKROS, Apple, AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Daiichi Sankyo, Duke Clinical Research Institute, US Food and Drug Administration, Glaxosmithkline, Janssen Pharmaceutica, Medtronic Foundation, Novartis, and Pfizer; Consulting from Abbvie, Bayer, Boehringer Ingelheim, Boston Scientific, Bristol Myers Squibb, CeleCor Therapeutics, Correvio, Espero BioPharma, Janssen, Medscape, Medtronic LLC, Medtronic Inc, Merck, National Institute of Health, Novo Nordisk, Novartis, Pfizer, Rhoshan Pharmaceuticals, and Roche Diagnostics. Dr. Giugliano reports research support from Amgen and Anthos Therapeutics; Honoraria for CME lectures from Amgen, Daiichi Sankyo, and Servier; Consultant fees from Amarin, American College of Cardiology, Amgen, Astra Zeneca, CryoLife, CVS Caremark, Daiichi Sankyo, Esperion, Gilead, Glaxosmithkline, SAJA Pharmaceuticals, Samsung, and Servier. Dr. Hong has nothing to disclose. Dr. Patel reports receiving research grants/contracts from AstraZeneca, Bayer, and Janssen Research and Development; Funding through Duke for educational activities from AstraZeneca and Janssen Research and Development; Consulting from AstraZeneca, Bayer, and the Thrombosis Research Institute. Dr. Wallentin reports grants from AstraZeneca, Bristol-Myers Squibb/Pfizer, GlaxoSmithKline, Merck & Co, Boehringer Ingelheim, and Roche Diagnostics; Personal fees from Abbott. Dr. Wallentin has a patent (EP2047275B1) licensed to Roche Diagnostics and a patent (US8951742B2) licensed to Roche Diagnostics. Dr. Eikelboom reports consulting/honoraria support from Astra-Zeneca, Bayer, Boehringer-Ingelheim, Bristol-Myer-Squibb, Daiichi-Sankyo, Eli-Lilly, Glaxo-Smith-Kline, Pfizer, Janssen, Sanofi-Aventis, Servier and grant support from Astra-Zeneca, Bayer, Boehringer-Ingelheim, Bristol-Myer-Squibb, Glaxo-Smith-Kline, Pfizer, Janssen, Sanofi-Aventis. Dr. Connolly reports research support and honoraria for consulting and lectures from Portola, BMS, Pfizer, Javelin, Boehringer Ingelheim, Bayer, Daiichi Sankyo, and Abbott.

Figures

Figure 1
Warfarin dosing optimization as a sequential reinforcement learning task using semi-Markov Decision Processes. The figure illustrates the first several time intervals of a patient trajectory. The highlighted values in the top part of the figure illustrate a single patient trajectory. Each observed time step (T = 0, T = 1, etc.) has a corresponding International Normalized Ratio (INR) test result. The action space is illustrated in purple and is defined as percent changes in warfarin dose; the arrows illustrate that the effect of a change in dose is observed at the following time step. The dynamic elements of the state space are INR test results. We employ a semi-Markov Decision Process framework to handle the inconsistent time intervals between observations inherent to warfarin management, illustrated in the bottom part of the figure. Linear interpolation (Rosendaal’s method) is used to generate INR values for intermediate unobserved time steps. We then calculate intermediate rewards based on interpolated INR values (rewarded when INR is in the range of 2–3) and apply a cumulative discounted reward to each observed time step. The cumulative discounted reward for an observed time step is applied to the action taken at the previous observed time step (e.g., the dosing action observed at T = 1 is rewarded based on the INR value observed at T = 2). We do not model warfarin initiation, so there is no reward function at T = 0. Observed time steps are displayed in darker shades; unobserved interpolated time steps are displayed in lighter shades.
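The reward scheme in this caption can be sketched as follows. This is an illustrative reading of the caption rather than the authors' implementation: the function names, the 0/1 reward for an in-range day, the daily step size, and the discount factor `gamma` are all assumptions, since the caption does not give them.

```python
def interval_reward(day0, inr0, day1, inr1, gamma=0.99, low=2.0, high=3.0):
    """Cumulative discounted reward over the days between two observed INR tests.

    Intermediate INR values are linearly interpolated. Each interpolated day
    earns a reward of 1.0 if its INR is in [low, high], else 0.0 (assumed
    reward shape), discounted by gamma per day elapsed since day0.
    """
    span = day1 - day0
    total = 0.0
    for k in range(1, span + 1):
        inr = inr0 + (inr1 - inr0) * k / span
        reward = 1.0 if low <= inr <= high else 0.0
        total += gamma ** (k - 1) * reward
    return total


def trajectory_rewards(observations, gamma=0.99):
    """Reward credited to the dosing action at each observed step except the last.

    observations: list of (day, inr) tuples sorted by day. Entry i of the
    result is the reward for the action taken at observation i, computed from
    the interval ending at observation i + 1.
    """
    return [
        interval_reward(d0, i0, d1, i1, gamma)
        for (d0, i0), (d1, i1) in zip(observations, observations[1:])
    ]


# Hypothetical trajectory with irregular test spacing and a large gamma gap
# chosen only to make the arithmetic easy to follow.
rewards = trajectory_rewards([(0, 2.5), (2, 2.5), (4, 4.5)], gamma=0.5)
```

With these made-up numbers the first interval has two in-range days (1 + 0.5 = 1.5) and the second has none, so `rewards` is `[1.5, 0.0]`. Collapsing a variable-length interval into one discounted sum is what lets a semi-Markov formulation treat unevenly spaced INR tests as single transitions.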
Figure 2
Comparison of clinician, reinforcement learning (RL), and benchmark policies. The figure illustrates clinician, benchmark algorithm and RL policies using heatmaps, where the X axis illustrates INR result bins, and the Y axis illustrates the distribution of dose recommendations within each INR result bin. On average, the RL policy recommends larger dose changes in the context of very high and very low INR values than either the clinician or benchmark policies.
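The quantity plotted in each heatmap, the distribution of dose recommendations within each INR bin, can be computed along these lines. This is a rough sketch: the bin edges, the action labels, and the function name are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def dose_distribution_by_inr_bin(records, bin_edges):
    """Return, per INR bin, the share of each dose-change action.

    records: iterable of (inr, action_label) pairs.
    bin_edges: ascending list of INR cut points; bin index b counts how many
    edges the INR value is greater than or equal to.
    """
    counts = defaultdict(Counter)
    for inr, action in records:
        bin_index = sum(inr >= edge for edge in bin_edges)
        counts[bin_index][action] += 1
    return {
        b: {action: n / sum(c.values()) for action, n in c.items()}
        for b, c in counts.items()
    }


# Hypothetical decisions: (INR at the visit, recommended dose change).
records = [(1.4, "+10%"), (1.4, "+10%"), (1.4, "0%"), (2.5, "0%")]
heatmap = dose_distribution_by_inr_bin(records, bin_edges=[2.0, 3.0])
```

Each inner dictionary sums to 1, so one heatmap column per INR bin can be drawn directly from it, and the clinician, benchmark and RL policies can be compared by running the same function on each policy's decisions.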
Figure 3
Weighted linear regression of the association between mean center algorithm-consistency and mean center time in therapeutic range (TTR). Mean center algorithm-consistent dosing and TTR were calculated by averaging the values obtained from patients in each center. The regression model was weighted by the number of patients per center. Each data point represents a single center, and the size of the data point represents the number of patients in that center.
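A weighted fit of this kind can be sketched in plain Python as below. This only illustrates weighted least squares with one predictor, weighted by patients per center; it is not the multilevel model used for the main analysis, and the function name and all data values are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    x: mean algorithm-consistent dosing per center (predictor)
    y: mean TTR per center (response)
    w: number of patients per center (weights)
    Returns (slope, intercept, weighted R squared).
    """
    w_sum = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_res = sum(
        wi * (yi - (intercept + slope * xi)) ** 2 for wi, xi, yi in zip(w, x, y)
    )
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up centers lying exactly on a line, used only to check the arithmetic.
consistency = [0.0, 1.0, 2.0, 3.0]
ttr_values = [1.0, 3.0, 5.0, 7.0]
patients = [10, 20, 5, 1]
slope, intercept, r_squared = weighted_linear_fit(consistency, ttr_values, patients)
```

Weighting by center size means a center with many patients pulls the fitted line more strongly than a small center, which matches the point sizes described in the caption.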
