IEEE Open Journal of the Communications Society (Jan 2024)
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration
Abstract
Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. This paper proposes a Bayesian deep reinforcement learning (RL) framework for the joint orchestration of O-RAN and MEC. The framework jointly controls the O-RAN functional splits, the allocation of O-RAN/MEC computing resources, the hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize the MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding an exact model of the underlying O-RAN/MEC system is impractical, since O-RAN and MEC share the same resources, serve heterogeneous demands, and their parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. To address these challenges, a model-free RL agent that combines a Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that the proposed approach is data-efficient (i.e., it converges significantly faster), increases the reward by 32% compared with its non-Bayesian variant, and outperforms Deep Deterministic Policy Gradient (DDPG) by up to 41%.
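To illustrate the action-branching idea referenced in the abstract, the following is a minimal sketch of a branching dueling Q-network in Python/PyTorch. The branch names, state dimension, branch sizes, and layer widths are illustrative assumptions rather than the paper's exact architecture; the point is that each action dimension (e.g., functional split, compute allocation, hosting location, routing) gets its own Q-value head over a shared representation, so the multidimensional discrete action space factorizes instead of being enumerated jointly.

```python
import torch
import torch.nn as nn

class BranchingQNetwork(nn.Module):
    """Dueling Q-network with action branching: a shared torso, one common
    state-value head, and one advantage head per action dimension."""

    def __init__(self, state_dim, branch_sizes, hidden=256):
        super().__init__()
        self.torso = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)  # shared state value V(s)
        # One advantage head per action dimension (branch).
        self.branches = nn.ModuleList(
            [nn.Linear(hidden, n) for n in branch_sizes]
        )

    def forward(self, state):
        z = self.torso(state)
        v = self.value(z)  # shape (batch, 1)
        # Per-branch Q-values: Q_b(s, a) = V(s) + A_b(s, a) - mean_a A_b(s, a)
        q_per_branch = []
        for head in self.branches:
            adv = head(z)
            q_per_branch.append(v + adv - adv.mean(dim=-1, keepdim=True))
        return q_per_branch

# Hypothetical example: a 32-dimensional state and four action dimensions
# (functional split, compute allocation, hosting location, routing).
net = BranchingQNetwork(state_dim=32, branch_sizes=[3, 5, 4, 6])
q_values = net(torch.zeros(1, 32))
greedy_action = [q.argmax(dim=-1).item() for q in q_values]
```

In a DDQN-style agent, each branch would be trained with its own temporal-difference target while the torso is shared; the Bayesian exploration-exploitation strategy described in the abstract would replace the usual epsilon-greedy action selection over these per-branch Q-values.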
Keywords