Stochastic Systems (Jan 2014)
Tuning approximate dynamic programming policies for ambulance redeployment via direct search
Abstract
In this paper we consider approximate dynamic programming methods for ambulance redeployment. We first demonstrate through simple examples how typical value function fitting techniques, such as approximate policy iteration and linear programming, may fail to locate a high-quality policy even when the value function approximation architecture is rich enough to represent the optimal policy. To make up for this potential shortcoming, we show how to use direct search methods to tune the parameters in a value function approximation architecture so as to obtain high-quality policies. Direct search is computationally intensive. We therefore use a post-decision state dynamic programming formulation of ambulance redeployment that, together with direct search, requires far less computation than the original formulation, with no noticeable loss in policy performance. We provide further theoretical support for the post-decision state formulation of the ambulance redeployment problem by showing that this formulation can be obtained through a limiting argument on the original dynamic programming formulation.
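To make the tuning idea concrete, the sketch below illustrates (in a heavily simplified, hypothetical setting, not the authors' implementation) what it means to tune the coefficients of a linear value function approximation over post-decision states by direct search: the policy acts greedily with respect to the approximation, its performance is estimated by simulation, and a simple compass search adjusts the coefficients. The one-dimensional simulator, the basis functions `phi`, and all other names are illustrative assumptions.

```python
import numpy as np


def phi(positions):
    """Illustrative basis functions of a post-decision state
    (ambulance positions on the unit interval); purely assumed."""
    return np.array([1.0, positions.mean(), positions.std()])


def greedy_redeploy(theta, candidates):
    """One-step greedy policy: choose the redeployment whose
    post-decision state has the highest approximate value phi^T theta."""
    values = [phi(c) @ theta for c in candidates]
    return candidates[int(np.argmax(values))]


def simulate_policy(theta, n_calls=200, n_amb=3, seed=0):
    """Toy simulation used only to score a parameter vector: calls arrive
    uniformly on [0, 1], the nearest ambulance responds, and it is then
    redeployed greedily. Returns the average response distance (lower is
    better). A fixed seed gives common random numbers across comparisons."""
    rng = np.random.default_rng(seed)
    positions = rng.random(n_amb)
    total = 0.0
    for _ in range(n_calls):
        call = rng.random()
        i = int(np.argmin(np.abs(positions - call)))
        total += abs(positions[i] - call)
        # Candidate post-decision states: send the freed ambulance to a base.
        candidates = []
        for base in np.linspace(0.1, 0.9, 5):
            new = positions.copy()
            new[i] = base
            candidates.append(new)
        positions = greedy_redeploy(theta, candidates)
    return total / n_calls


def direct_search(theta0, step=0.5, shrink=0.5, iters=30):
    """Simple compass (coordinate) search: perturb one coefficient at a
    time, keep improving moves, shrink the step when nothing helps."""
    theta, best = theta0.copy(), simulate_policy(theta0)
    for _ in range(iters):
        improved = False
        for j in range(len(theta)):
            for delta in (step, -step):
                trial = theta.copy()
                trial[j] += delta
                score = simulate_policy(trial)
                if score < best:
                    theta, best, improved = trial, score, True
        if not improved:
            step *= shrink
    return theta, best


theta, avg_dist = direct_search(np.zeros(3))
print("tuned coefficients:", theta, "average response distance:", avg_dist)
```

The key point mirrored from the paper is that the search optimizes simulated policy performance directly, rather than fitting the value function to sampled costs; any derivative-free optimizer could replace the compass search used here.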