Computation Offloading and Resource Allocation Based on P-DQN in LEO Satellite Edge Networks

Xu Yang; Hai Fang; Yuan Gao; Xingjie Wang; Kan Wang; Zheng Liu

doi:10.3390/s23249885

Sensors (Dec 2023)

Affiliations

Xu Yang: Xi’an Institute of Space Radio Technology, Xi’an 710100, China
Hai Fang: Xi’an Institute of Space Radio Technology, Xi’an 710100, China
Yuan Gao: Xi’an Institute of Space Radio Technology, Xi’an 710100, China
Xingjie Wang: School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
Kan Wang: School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
Zheng Liu: School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China

DOI: https://doi.org/10.3390/s23249885
Journal volume & issue: Vol. 23, no. 24, p. 9885

Abstract

Traditional low Earth orbit (LEO) satellite networks typically operate independently of terrestrial networks and have developed relatively slowly because of limited on-board capacity. By integrating emerging mobile edge computing (MEC) with LEO satellite networks to form a business-oriented “end-edge-cloud” multi-level computing architecture, ground terminals can offload some computation-sensitive tasks to satellites, so that the network can satisfy more tasks. Making computation offloading and resource allocation decisions in LEO satellite edge networks nevertheless poses challenges in tracking network dynamics and handling complex actions. To handle the discrete-continuous hybrid action space and time-varying networks, this work uses the parameterized deep Q-network (P-DQN) for joint computation offloading and resource allocation. First, the characteristics of time-varying channels are modeled, and both communication and computation models are constructed under three different offloading decisions. Second, constraints on task offloading decisions, on the remaining available computing resources, and on the power control of the LEO satellites and the cloud server are formulated, followed by the problem of maximizing the number of satisfied tasks over the long run. Third, using the parameterized action Markov decision process (PAMDP) and P-DQN, joint computation offloading, resource allocation, and power control decisions are made in real time, to accommodate the dynamics of LEO satellite edge networks and to handle the discrete-continuous hybrid action space. Simulation results show that the proposed P-DQN method approaches the optimal control and, in terms of the long-term rate of satisfied tasks, outperforms other reinforcement learning (RL) methods that handle only a discrete or only a continuous action space.
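
To make the discrete-continuous hybrid action space more concrete, the following is a minimal sketch of how a P-DQN-style agent might select an action; it is not the authors' implementation. The labels for the three offloading decisions, the state and parameter dimensions, and the random linear maps standing in for trained networks are all assumptions made for illustration. An actor proposes continuous parameters for every discrete action, a Q-function scores each discrete action given those parameters, and the agent returns the best discrete action together with its own parameters.

```python
import numpy as np

# Hypothetical labels: the abstract mentions three offloading decisions
# but does not name them.
ACTIONS = ("local", "leo_satellite", "cloud")
STATE_DIM = 4   # assumed size of the observed network state
PARAM_DIM = 2   # assumed continuous parameters per action (e.g. power, compute share)

rng = np.random.default_rng(0)
# Random linear maps stand in for the trained actor and Q-network.
W_actor = rng.normal(size=(STATE_DIM, len(ACTIONS) * PARAM_DIM))
W_q = rng.normal(size=(STATE_DIM + len(ACTIONS) * PARAM_DIM, len(ACTIONS)))


def actor(state):
    """Map a state to continuous parameters in (0, 1) for every discrete action."""
    logits = state @ W_actor
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid keeps parameters bounded


def q_values(state, params):
    """Score each discrete action given the state and all proposed parameters."""
    return np.concatenate([state, params]) @ W_q


def select_action(state, epsilon=0.0):
    """Return (discrete action index, continuous parameters of that action)."""
    params = actor(state)
    if rng.random() < epsilon:
        # Explore: pick a discrete action uniformly at random.
        k = int(rng.integers(len(ACTIONS)))
    else:
        # Exploit: pick the discrete action with the highest Q-value.
        k = int(np.argmax(q_values(state, params)))
    x_k = params.reshape(len(ACTIONS), PARAM_DIM)[k]
    return k, x_k


state = rng.normal(size=STATE_DIM)
k, x_k = select_action(state)
print(ACTIONS[k], x_k)
```

In a real system the two linear maps would be replaced by trained neural networks, and the bounded parameters would be rescaled to the feasible power and computing-resource ranges set by the constraints.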

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
