物联网学报 (Chinese Journal on Internet of Things), Jun 2024
DDPG-based performance optimization algorithm for IRS-assisted simultaneous wireless information and power transfer systems
Abstract
For an intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) simultaneous wireless information and power transfer (SWIPT) system, the beamforming vector at the base station and the reflection beamforming vector of the IRS are jointly optimized to maximize spectral efficiency. The optimization is subject to three constraints: the maximum transmit power of the base station, the unit-modulus constraint on the IRS reflection phase-shift matrix, and the minimum harvested-energy requirement of the energy receiver. To solve the resulting non-convex optimization problem, a deep deterministic policy gradient (DDPG) algorithm based on deep reinforcement learning is proposed. Simulation results show that the average reward of the DDPG algorithm depends on the learning rate. With an appropriately chosen learning rate, the DDPG algorithm achieves an average mutual information similar to that of the traditional non-convex optimization algorithm, while its running time is significantly lower. Even when the number of antennas and the number of reflecting elements are increased, the DDPG algorithm still converges in a short time. This indicates that the DDPG algorithm effectively improves computational efficiency and is suitable for communication services with strict real-time requirements.
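The constraints and objective described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the channel names (`h_d`, `h_r`, `G`), the function names, the single-user narrowband channel model, the linear energy-harvesting model, and the penalty-based reward are all assumptions introduced here for illustration. The sketch shows how an unconstrained actor output could be mapped onto the feasible set (transmit power budget and unit-modulus phase shifts) and how a reward combining spectral efficiency with the energy constraint might be evaluated.

```python
import numpy as np

def project_action(w_raw, theta_raw, p_max):
    """Map raw (hypothetical) actor outputs onto the feasible set."""
    # Rescale the beamformer only when it exceeds the power budget.
    power = np.vdot(w_raw, w_raw).real
    w = w_raw * np.sqrt(p_max / power) if power > p_max else w_raw
    # Keep only the phase of each reflection coefficient (unit modulus).
    theta = np.exp(1j * np.angle(theta_raw))
    return w, theta

def effective_channel(h_d, h_r, G, theta):
    """Cascaded channel, assumed model: h_d^H + h_r^H diag(theta) G.

    h_d: (M,) direct link, h_r: (N,) IRS-to-receiver link,
    G: (N, M) base-station-to-IRS link, theta: (N,) reflection coefficients.
    """
    return h_d.conj() + (h_r.conj() * theta) @ G

def spectral_efficiency(h_eff, w, noise_power):
    """Achievable rate in bit/s/Hz for a single information receiver."""
    snr = np.abs(h_eff @ w) ** 2 / noise_power
    return np.log2(1.0 + snr)

def harvested_power(g_eff, w, efficiency=1.0):
    """Linear energy-harvesting model (an assumption of this sketch)."""
    return efficiency * np.abs(g_eff @ w) ** 2

def reward(rate, energy, e_min, penalty=10.0):
    """Rate minus a penalty proportional to the energy shortfall (assumed form)."""
    shortfall = max(0.0, e_min - energy)
    return rate - penalty * shortfall

# Example with random channels: M = 4 antennas, N = 8 reflecting elements.
rng = np.random.default_rng(0)
M, N = 4, 8
cplx = lambda *shape: (rng.standard_normal(shape)
                       + 1j * rng.standard_normal(shape)) / np.sqrt(2)
w, theta = project_action(cplx(M) * 5.0, cplx(N), p_max=1.0)
h_info = effective_channel(cplx(M), cplx(N), cplx(N, M), theta)
rate = spectral_efficiency(h_info, w, noise_power=1e-2)
```

In a DDPG setting, a projection of this kind would typically be applied to the actor's output before the environment computes the reward, so that every explored action satisfies the hard constraints, while the energy requirement is handled softly through the penalty term.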