Throughput-Aware Cooperative Reinforcement Learning for Adaptive Resource Allocation in Device-to-Device Communication

Muhidul Islam Khan; Muhammad Mahtab Alam; Yannick Le Moullec; Elias Yaacoub

doi:10.3390/fi9040072

Future Internet (Nov 2017)

Throughput-Aware Cooperative Reinforcement Learning for Adaptive Resource Allocation in Device-to-Device Communication

Muhidul Islam Khan,
Muhammad Mahtab Alam,
Yannick Le Moullec,
Elias Yaacoub

Affiliations

Muhidul Islam Khan: Thomas Johann Seeback Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Muhammad Mahtab Alam: Thomas Johann Seeback Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Yannick Le Moullec: Thomas Johann Seeback Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Elias Yaacoub: Faculty of Computer Studies, Arab Open University, Omar Bayhoum Str. - Park Sector, Beirut 2058 4518, Lebanon

DOI: https://doi.org/10.3390/fi9040072
Journal volume & issue: Vol. 9, no. 4
p. 72

Abstract

Read online

Device-to-device (D2D) communication is an essential feature for the future cellular networks as it increases spectrum efficiency by reusing resources between cellular and D2D users. However, the performance of the overall system can degrade if there is no proper control over interferences produced by the D2D users. Efficient resource allocation among D2D User equipments (UE) in a cellular network is desirable since it helps to provide a suitable interference management system. In this paper, we propose a cooperative reinforcement learning algorithm for adaptive resource allocation, which contributes to improving system throughput. In order to avoid selfish devices, which try to increase the throughput independently, we consider cooperation between devices as promising approach to significantly improve the overall system throughput. We impose cooperation by sharing the value function/learned policies between devices and incorporating a neighboring factor. We incorporate the set of states with the appropriate number of system-defined variables, which increases the observation space and consequently improves the accuracy of the learning algorithm. Finally, we compare our work with existing distributed reinforcement learning and random allocation of resources. Simulation results show that the proposed resource allocation algorithm outperforms both existing methods while varying the number of D2D users and transmission power in terms of overall system throughput, as well as D2D throughput by proper Resource block (RB)-power level combination with fairness measure and improving the Quality of service (QoS) by efficient controlling of the interference level.

Published in Future Internet

ISSN: 1999-5903 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/futureinternet/

About the journal

Abstract

Keywords