Dianxin kexue (Aug 2023)
Research on constrained policy reinforcement learning based multi-objective optimization of computing power network
Abstract
A computing power network must maximize system performance indicators while meeting users' service requirements. Existing methods rely mainly on multi-objective weighting, which suffers from hyperparameters that are difficult to determine and poor applicability across scenarios. To address this, the characteristics of the computing power network's objectives were analyzed. User service requirements were taken as policy constraints, and the network's performance indicators were taken as the optimization objective within a constrained policy optimization framework. A multi-level iterative scheme over the value function, the policy, and the hyperparameters ensured that user service requirements were met in expectation while system performance was optimized. In addition, a multi-scale step length (MSL) method for hyperparameter optimization was studied, which further improved the stability and accuracy of the system. Simulation results show that the proposed method has good convergence and stability in single-terminal single-edge-server scenarios, in multi-terminal multi-edge-server scenarios, and under changing system load.
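The idea of treating a user requirement as a constraint, rather than as a weighted term in the objective, can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses a deterministic toy problem instead of reinforcement learning, and every name and number in it (`energy_cost`, `latency`, `LATENCY_LIMIT`, the step sizes) is a hypothetical assumption. The abstract does not say how MSL works, so the two-scale rule for the Lagrange multiplier below (a coarse step far from feasibility, a fine step near it) is only one plausible reading.

```python
# Toy constrained optimization: choose an offloading fraction x in [0, 1]
# that minimizes a system cost while keeping user latency under a limit.
# All functions and constants are illustrative assumptions.

LATENCY_LIMIT = 4.0  # assumed user requirement: latency(x) <= 4


def energy_cost(x):
    """Hypothetical system cost that grows with the offloaded fraction x."""
    return x ** 2


def latency(x):
    """Hypothetical user latency that shrinks as more work is offloaded."""
    return 10.0 * (1.0 - x)


def policy_step(x, lam, lr=0.1):
    """One gradient step on energy_cost(x) + lam * (latency(x) - limit)."""
    grad = 2.0 * x - 10.0 * lam
    return min(1.0, max(0.0, x - lr * grad))  # keep x inside [0, 1]


def multiplier_step(lam, violation, coarse=0.01, fine=0.002, threshold=1.0):
    """Dual ascent on the multiplier with an assumed two-scale step rule."""
    step = coarse if abs(violation) > threshold else fine
    return max(0.0, lam + step * violation)  # multiplier stays non-negative


def solve(outer_iters=200, inner_iters=50):
    x, lam = 0.0, 0.0
    for _ in range(outer_iters):
        for _ in range(inner_iters):            # inner level: improve the policy
            x = policy_step(x, lam)
        violation = latency(x) - LATENCY_LIMIT  # evaluate the constraint
        lam = multiplier_step(lam, violation)   # outer level: tune the multiplier
    return x, lam


if __name__ == "__main__":
    x, lam = solve()
    print(f"offload={x:.3f} latency={latency(x):.3f} multiplier={lam:.3f}")
```

For this toy problem the constrained optimum can be derived by hand: the latency limit requires x >= 0.6, so the cheapest feasible choice is x = 0.6, with multiplier 0.12 from the stationarity condition 2x = 10λ. The loop approaches these values without any hand-tuned weight between cost and latency, which is the advantage over multi-objective weighting that the abstract points to.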