IEEE Access (Jan 2018)
Rested and Restless Bandits With Constrained Arms and Hidden States: Applications in Social Networks and 5G Networks
Abstract
The problem of rested and restless multi-armed bandits with constrained availability (RMAB-CA) of arms is considered. The states of the arms evolve in a Markovian manner, and the exact states are hidden from the decision maker. First, some structural results on the value functions are presented. Following these results, the optimal policy is shown to be a threshold policy. Furthermore, indexability is established for both rested and restless RMAB-CAs. An index formula is derived for the rested model, while an algorithm for computing the index is provided for the restless case.
Keywords