Q-Learning Based Joint Energy-Spectral Efficiency Optimization in Multi-Hop Device-to-Device Communication

Muhidul Islam Khan; Luca Reggiani; Muhammad Mahtab Alam; Yannick Le Moullec; Navuday Sharma; Elias Yaacoub; Maurizio Magarini

doi:10.3390/s20226692

Sensors (Nov 2020)

Affiliations

Muhidul Islam Khan: Thomas Johann Seebeck Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Luca Reggiani: Dipartimento di Elettronica e Informazione, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
Muhammad Mahtab Alam: Thomas Johann Seebeck Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Yannick Le Moullec: Thomas Johann Seebeck Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Navuday Sharma: Thomas Johann Seebeck Department of Electronics, School of Information Technology, Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia
Elias Yaacoub: Faculty of Computer Studies, Arab Open University, Beirut 2058 4518, Lebanon
Maurizio Magarini: Dipartimento di Elettronica e Informazione, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy

DOI: https://doi.org/10.3390/s20226692
Journal volume & issue: Vol. 20, no. 22, p. 6692

Abstract

In scenarios such as critical public safety communication networks, On-Scene Available (OSA) user equipment (UE) may be only partially connected to the network infrastructure, e.g., due to physical damage or deliberate deactivation by the authorities. In this work, we consider multi-hop Device-to-Device (D2D) communication in a hybrid infrastructure where OSA UEs connect to each other seamlessly in order to disseminate critical information to a deployed command center. The challenge that we address is to simultaneously keep the OSA UEs alive as long as possible and send the critical information to a final destination (e.g., a command center) as rapidly as possible, while considering the heterogeneous characteristics of the OSA UEs. We propose a dynamic adaptation approach based on machine learning to improve joint energy-spectral efficiency (ESE). We apply a Q-learning scheme in a hybrid fashion (partially distributed and partially centralized) in learner agents (distributed OSA UEs) and scheduler agents (remote radio heads, or RRHs), for which next hop selection and RRH selection algorithms are proposed. Our simulation results show that the proposed dynamic adaptation approach outperforms the baseline system by approximately 67% in terms of joint energy-spectral efficiency, with the energy efficiency of the OSA UEs improving by approximately 30%. Finally, the results also show that our proposed framework with C-RAN reduces latency by approximately 50% relative to the baseline.
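
To make the idea of Q-learning for next hop selection more concrete, the following is a minimal sketch of a generic tabular Q-learning update applied to choosing a neighbor. It is not the authors' algorithm: the reward formula, the function names (`joint_reward`, `select_next_hop`, `q_update`), the node labels, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def joint_reward(bits_delivered, energy_joules, bandwidth_hz, duration_s):
    """Hypothetical joint reward (an assumption, not the paper's formula).

    Multiplies spectral efficiency (bit/s/Hz) by energy efficiency (bit/J).
    """
    spectral_eff = bits_delivered / (bandwidth_hz * duration_s)
    energy_eff = bits_delivered / energy_joules
    return spectral_eff * energy_eff


def select_next_hop(q_table, node, neighbors, epsilon, rng):
    """Epsilon-greedy choice of the next hop among the current neighbors."""
    if rng.random() < epsilon:
        return rng.choice(neighbors)  # explore a random neighbor
    # Exploit: pick the neighbor with the highest learned value.
    return max(neighbors, key=lambda n: q_table[(node, n)])


def q_update(q_table, node, next_hop, reward, next_neighbors,
             alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update for the (node, next_hop) pair.

    q_table is assumed to be a defaultdict(float), so unseen pairs start at 0.
    """
    best_future = max(
        (q_table[(next_hop, n)] for n in next_neighbors), default=0.0
    )
    old = q_table[(node, next_hop)]
    q_table[(node, next_hop)] = old + alpha * (reward + gamma * best_future - old)
    return q_table[(node, next_hop)]


if __name__ == "__main__":
    q = defaultdict(float)
    rng = random.Random(0)
    # One learning step: "ue1" forwarded via "ue2", which can reach "rrh".
    q_update(q, "ue1", "ue2", reward=1.0, next_neighbors=["rrh"])
    # With epsilon=0.0 the choice is purely greedy.
    print(select_next_hop(q, "ue1", ["ue2", "ue3"], epsilon=0.0, rng=rng))
```

In this sketch each UE would keep its own table (the distributed part), while any centralized scheduling at the RRHs is left out entirely; how the two interact in the proposed framework is described in the full paper, not here.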

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
