Joint Deep Reinforcement Learning and Unsupervised Learning for Channel Selection and Power Control in D2D Networks

Ming Sun; Yanhui Jin; Shumei Wang; Erzhuang Mei

doi:10.3390/e24121722

Entropy (Nov 2022)

Affiliations

Ming Sun: College of Computer and Control Engineering, Qiqihar University, Qiqihar 161006, China
Yanhui Jin: College of Computer and Control Engineering, Qiqihar University, Qiqihar 161006, China
Shumei Wang: School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China
Erzhuang Mei: College of Computer and Control Engineering, Qiqihar University, Qiqihar 161006, China

DOI: https://doi.org/10.3390/e24121722
Journal volume & issue: Vol. 24, no. 12, p. 1722

Abstract

Device-to-device (D2D) technology enables direct communication between devices, which can help alleviate the shortage of spectrum resources in 5G networks. Because channels are shared among multiple D2D user pairs, however, serious interference can arise between those pairs. To reduce interference, increase network capacity, and improve wireless spectrum utilization, this paper proposes a distributed resource allocation algorithm that combines a deep Q-network (DQN) with an unsupervised learning network. First, a DQN is constructed to solve channel allocation in a dynamic, unknown environment in a distributed manner. Then, a deep power control neural network trained with an unsupervised learning strategy outputs an optimized power control scheme for the channels, maximizing the transmit sum-rate while handling the corresponding constraints. In contrast to traditional centralized approaches, which require the collection of instantaneous global network information, the proposed algorithm treats each transmitter as a learning agent that performs channel selection and power control using a small amount of locally collected state information. Simulation results show that the proposed algorithm converges faster and achieves a higher transmit sum-rate than the traditional centralized and distributed algorithms it is compared against.
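To make the optimization target concrete, the transmit sum-rate can be sketched as below. This is a minimal illustration that assumes the standard Shannon-rate model with co-channel interference; the abstract does not state the exact formula, so the function name `sum_rate`, the gain-matrix layout, the noise power, and all numbers are hypothetical choices rather than the authors' implementation.

```python
import numpy as np

def sum_rate(gain, channel, power, noise=1e-3):
    """Sum of Shannon rates (bit/s/Hz) over all D2D pairs.

    Assumed (illustrative) inputs:
      gain[i, j]  channel gain from transmitter j to receiver i
      channel[i]  index of the channel selected by pair i
      power[i]    transmit power of pair i
      noise       receiver noise power
    """
    gain = np.asarray(gain, dtype=float)
    channel = np.asarray(channel)
    power = np.asarray(power, dtype=float)

    # same[i, j] is True when pairs i and j transmit on the same channel.
    same = channel[:, None] == channel[None, :]
    # received[i, j] is the power receiver i sees from transmitter j.
    received = gain * power[None, :]
    signal = np.diag(received)
    # Interference: everything received on the shared channel except
    # the pair's own signal.
    interference = (received * same).sum(axis=1) - signal
    sinr = signal / (interference + noise)
    return float(np.log2(1.0 + sinr).sum())


# Two pairs with weak cross gains: sharing one channel vs. using two.
gains = [[1.0, 0.1],
         [0.2, 1.0]]
shared = sum_rate(gains, channel=[0, 0], power=[1.0, 1.0])
split = sum_rate(gains, channel=[0, 1], power=[1.0, 1.0])
print(f"shared channel: {shared:.2f}, separate channels: {split:.2f}")
```

Under these assumptions, channel selection changes which entries of the interference term are active, and power control scales both a pair's own signal and the interference it causes to others, which is why the two decisions are optimized jointly.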

Published in Entropy

ISSN: 1099-4300 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Astronomy: Astrophysics; Science: Physics
Website: http://www.mdpi.com/journal/entropy
