Scientific Reports (Feb 2024)
Optimizing warfarin dosing for patients with atrial fibrillation using machine learning
Abstract
Abstract While novel oral anticoagulants are increasingly used to reduce risk of stroke in patients with atrial fibrillation, vitamin K antagonists such as warfarin continue to be used extensively for stroke prevention across the world. While effective in reducing the risk of strokes, the complex pharmacodynamics of warfarin make it difficult to use clinically, with many patients experiencing under- and/or over- anticoagulation. In this study we employed a novel implementation of deep reinforcement learning to provide clinical decision support to optimize time in therapeutic International Normalized Ratio (INR) range. We used a novel semi-Markov decision process formulation of the Batch-Constrained deep Q-learning algorithm to develop a reinforcement learning model to dynamically recommend optimal warfarin dosing to achieve INR of 2.0–3.0 for patients with atrial fibrillation. The model was developed using data from 22,502 patients in the warfarin treated groups of the pivotal randomized clinical trials of edoxaban (ENGAGE AF-TIMI 48), apixaban (ARISTOTLE) and rivaroxaban (ROCKET AF). The model was externally validated on data from 5730 warfarin-treated patients in a fourth trial of dabigatran (RE-LY) using multilevel regression models to estimate the relationship between center-level algorithm consistent dosing, time in therapeutic INR range (TTR), and a composite clinical outcome of stroke, systemic embolism or major hemorrhage. External validation showed a positive association between center-level algorithm-consistent dosing and TTR (R2 = 0.56). Each 10% increase in algorithm-consistent dosing at the center level independently predicted a 6.78% improvement in TTR (95% CI 6.29, 7.28; p < 0.001) and a 11% decrease in the composite clinical outcome (HR 0.89; 95% CI 0.81, 1.00; p = 0.015). These results were comparable to those of a rules-based clinical algorithm used for benchmarking, for which each 10% increase in algorithm-consistent dosing independently predicted a 6.10% increase in TTR (95% CI 5.67, 6.54, p < 0.001) and a 10% decrease in the composite outcome (HR 0.90; 95% CI 0.83, 0.98, p = 0.018). Our findings suggest that a deep reinforcement learning algorithm can optimize time in therapeutic range for patients taking warfarin. A digital clinical decision support system to promote algorithm-consistent warfarin dosing could optimize time in therapeutic range and improve clinical outcomes in atrial fibrillation globally.