Journal of King Saud University: Computer and Information Sciences (Sep 2022)

Distributed learning algorithm with synchronized epochs for dynamic spectrum access in unknown environment using multi-user restless multi-armed bandit

  • Himanshu Agrawal,
  • Krishna Asawa

Journal volume & issue
Vol. 34, no. 8
pp. 5435 – 5447

Abstract

Dynamic spectrum access using cognitive radio has many application areas, such as the smart grid, the Internet of Things, and various other device-to-device communication paradigms. In dynamic spectrum access, a user picks one of N channels to transmit on during each time slot. The user then receives a reward drawn from a finite set of reward states, and the selected channel is termed the active channel. The reward state of the active channel evolves according to an unknown Markov chain, whereas the reward state of each passive channel evolves as an arbitrary, unknown random process. The objective of a channel selection strategy is to minimize regret by selecting the best channel in terms of mean availability. This work proposes a strategy based on consecutive selections (epochs) of channels, dubbed Adaptive Sequencing of Exploration and Exploitation for Channel Selection in Unknown Environment (ASEE-CSUE). By suitably planning the sequencing of epochs, ASEE-CSUE can achieve regret of logarithmic order in time. Furthermore, extensive simulation results indicate that collisions and switching cost are less than 7% and 2%, respectively, and that the best channels are selected in more than 90% of the total time slots.
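
To make the idea of sequencing exploration and exploitation epochs concrete, the following is a minimal single-user sketch and not the ASEE-CSUE algorithm itself. It rests on several assumptions that are not taken from the abstract: every channel (active or passive) is modelled as a two-state Markov chain with reward 1 when free and 0 when busy, each exploration epoch samples every channel for a fixed block of slots, and each exploitation epoch doubles in length. The names `step`, `run_epochs`, `explore_len`, and the transition probabilities are hypothetical, and collisions between multiple users are not modelled.

```python
import random


def step(state, p01, p11, rng):
    """Advance a two-state Markov chain by one slot (state 1 = channel free)."""
    p_free = p11 if state == 1 else p01
    return 1 if rng.random() < p_free else 0


def run_epochs(channels, horizon, explore_len=20, seed=0):
    """Alternate exploration and exploitation epochs over `horizon` slots.

    channels: list of (p01, p11) pairs, an assumed channel model.
    Returns the channel chosen in each slot and the empirical mean
    availability estimated for each channel.
    """
    rng = random.Random(seed)
    n = len(channels)
    states = [0] * n          # hidden state of every channel
    reward_sum = [0] * n      # rewards observed on each channel
    plays = [0] * n           # slots spent on each channel
    history = []
    exploit_len = explore_len * n

    def play(ch):
        # All channels evolve every slot (restless); only `ch` is observed.
        for i, (p01, p11) in enumerate(channels):
            states[i] = step(states[i], p01, p11, rng)
        reward_sum[ch] += states[ch]
        plays[ch] += 1
        history.append(ch)

    while len(history) < horizon:
        # Exploration epoch: a block of consecutive slots on each channel.
        for ch in range(n):
            for _ in range(explore_len):
                if len(history) >= horizon:
                    break
                play(ch)
        # Exploitation epoch: stay on the empirically best channel.
        best = max(range(n), key=lambda c: reward_sum[c] / max(plays[c], 1))
        for _ in range(exploit_len):
            if len(history) >= horizon:
                break
            play(best)
        exploit_len *= 2  # longer exploitation as estimates improve

    estimates = [reward_sum[c] / max(plays[c], 1) for c in range(n)]
    return history, estimates


if __name__ == "__main__":
    # Hypothetical channels; the last one has the highest mean availability.
    channels = [(0.1, 0.3), (0.3, 0.5), (0.6, 0.9)]
    history, estimates = run_epochs(channels, horizon=20000)
    print("estimated availability:", [round(e, 3) for e in estimates])
    print("share of slots on channel 2:", history.count(2) / len(history))
```

Because the exploitation epochs double in length while the exploration epochs stay fixed, the number of exploration slots in this sketch grows only logarithmically with the horizon. That is the intuition behind epoch sequencing; the regret guarantee stated in the abstract belongs to the authors' algorithm, not to this simplified version.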

Keywords