Tongxin xuebao (Jun 2022)
New design paradigm for federated edge learning towards 6G: task-oriented resource management strategies
Abstract
Objectives: Edge intelligence technology, represented by federated edge learning, has emerged to make full use of the abundant data distributed at the network edge for training artificial intelligence models. The objective is to design wireless resource management strategies that directly optimize learning performance, such as the total model training time and learning convergence.

Methods: Based on the federated learning network architecture, resource allocation and user scheduling schemes were analyzed. 1. For the resource allocation problem, the trade-off between the number of communication rounds and the per-round delay was analyzed under the goal of minimizing the total training time. To meet the time constraint of each training round, more bandwidth should be allocated to devices with low computational power, compensating for long computation time with short communication time, and vice versa. Bandwidth allocation across devices should therefore consider both channel conditions and computing resources, in contrast to traditional bandwidth allocation schemes that consider channel conditions only. To this end, the total training time minimization problem was modeled, the quantization level and bandwidth allocation were jointly optimized, and an alternating optimization algorithm was designed to solve the problem (an illustrative sketch is given after the abstract). 2. For the user scheduling problem, a communication time minimization problem was modeled by linking data importance to the number of communication rounds and channel quality to the per-round communication delay, with a theoretical model unifying the two. Solving this problem shows that the optimal scheduling strategy emphasizes data importance in the early stage of training and channel quality in the later stage (see the second sketch after the abstract). The proposed single-device scheduling algorithm was also extended to multi-device scheduling scenarios.

Results: 1. For the resource allocation problem, under the optimal bandwidth allocation, the relationship between the total training time and the quantization level was obtained by simulation, running the same training process at least 5 times at each quantization level. The total training time is $T = N_{\epsilon} \cdot T_d$, where $N_{\epsilon}$, the number of communication rounds required to reach an $\epsilon$-accurate loss, is a decreasing function of the quantization level $q$, and the per-round delay $T_d$ is an increasing function of $q$. Moreover, the optimal quantization level obtained by theoretical optimization is consistent with the simulation results, verifying the effectiveness of the proposed algorithm. Based on the relationship between the $\epsilon$-neighborhood of the optimal loss value and the training time, the optimal quantization level and the optimal bandwidth allocation strategy were obtained by solving the training time minimization problem. 2. For the user scheduling problem, the proposed user scheduling scheme (TLM) was compared with three other common scheduling schemes in simulation, and the average accuracy was reported at communication times of 6 000 s and 14 000 s, where average accuracy was measured by the intersection over union (IoU) between the predicted and ground-truth values. The CA scheme yields the worst accuracy on car 1, which has the largest channel attenuation, while the IA scheme exhibits the lowest accuracy on car 4, whose data is less important. The ICA scheme seeks a balance between CA and IA, but owing to its heuristic nature its performance falls short of the TLM scheme.

Conclusions: 1. The training loss under the optimal quantization level and optimal bandwidth allocation reaches the predetermined threshold in the shortest time and achieves the highest test accuracy. Moreover, training under a non-optimal quantization level with optimal bandwidth allocation outperforms training under the optimal quantization level with equal bandwidth allocation, which further verifies the necessity of resource allocation. 2. The TLM scheme achieves slightly better performance early in training and significantly outperforms all other schemes after full training, owing to the inherently prospective nature of the proposed TLM protocol, in contrast to the myopic nature of the existing CA, IA, and ICA protocols.
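First sketch (Methods item 1). The following is a minimal illustration of jointly choosing the quantization level $q$ and the per-device bandwidth to minimize $T = N_{\epsilon}(q) \cdot T_d(q)$. The round-count model, device parameters, and bandwidth figures are all illustrative assumptions, not the paper's formulas, and the paper's alternating optimization is replaced here by a simpler stand-in: a 1D search over integer $q$ with the bandwidth sub-problem solved exactly by bisection (equalizing device finish times, which minimizes the per-round makespan).

```python
import numpy as np

# --- Assumed illustrative models (not the paper's exact formulas) ---
K = 4                                      # number of edge devices
B = 10e6                                   # total bandwidth in Hz (assumed)
t_comp = np.array([0.8, 0.3, 0.5, 1.2])    # per-round computation time, s (assumed)
rate_eff = np.array([2.0, 1.0, 1.5, 0.8])  # spectral efficiency, bit/s/Hz (assumed)
model_size = 1e6                           # number of model parameters (assumed)

def n_rounds(q):
    """Assumed decreasing round-count model N_eps(q) = a + c / q."""
    return 200 + 3000 / q

def round_delay(q, bw):
    """Per-round delay T_d: slowest device's computation + upload of q-bit gradients."""
    t_comm = q * model_size / (bw * rate_eff)
    return np.max(t_comp + t_comm)

def optimal_bandwidth(q, iters=60):
    """For fixed q, bisect on a round deadline t so every device can just finish
    by t; this equalizes finish times, which is makespan-optimal."""
    lo, hi = 0.0, 1e4
    for _ in range(iters):
        t = 0.5 * (lo + hi)
        slack = t - t_comp                 # time left for each upload
        if np.any(slack <= 0):
            lo = t                         # some device cannot even compute in time
            continue
        need = q * model_size / (rate_eff * slack)
        if need.sum() > B:
            lo = t                         # deadline infeasible, relax it
        else:
            hi = t                         # feasible, tighten it
    slack = hi - t_comp
    bw = q * model_size / (rate_eff * slack)
    return bw * (B / bw.sum())             # scale to use the full band

def search_quantization(q_grid=range(1, 33)):
    """1D search over integer quantization levels; bandwidth is re-optimized for
    each candidate q, so each T(q) is evaluated at its best allocation."""
    best = None
    for q in q_grid:
        bw = optimal_bandwidth(q)
        T = n_rounds(q) * round_delay(q, bw)
        if best is None or T < best[0]:
            best = (T, q, bw)
    return best

T, q_star, bw_star = search_quantization()
print(f"optimal q = {q_star}, total training time = {T:.1f} s")
```

Note how the two effects pull in opposite directions: a larger $q$ shrinks $N_{\epsilon}$ but inflates each round's upload, so the minimizer lies at an interior quantization level, matching the trade-off described in the Results.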
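Second sketch (Methods item 2). Below is a minimal scheduling rule whose weight shifts from data importance to channel quality as training progresses, reproducing the qualitative early/late behaviour the paper derives for the optimal policy. The exponential annealing weight, `beta`, and all numbers are assumptions for illustration; the paper's policy comes from solving a communication time minimization problem, which this sketch does not reproduce.

```python
import numpy as np

def schedule_device(importance, channel_gain, round_idx, total_rounds, beta=3.0):
    """Pick one device per round. The importance term dominates early in
    training and the channel term dominates late. The annealing weight
    w = exp(-beta * t / T) is an illustrative assumption."""
    w = np.exp(-beta * round_idx / total_rounds)   # decays from 1 toward ~0
    importance_n = importance / importance.max()   # normalize both criteria
    channel_n = channel_gain / channel_gain.max()
    score = w * importance_n + (1 - w) * channel_n
    return int(np.argmax(score))

# Toy example with 4 devices ("cars"): car 1 has the weakest channel and
# car 4 the least important data, echoing the simulation setup above.
importance = np.array([0.9, 0.6, 0.7, 0.2])
channel = np.array([0.1, 0.8, 0.5, 0.9])
T_rounds = 100
for t in (0, 50, 99):
    k = schedule_device(importance, channel, t, T_rounds)
    print(f"round {t}: schedule device {k}")
```

Early rounds select the important-data, weak-channel device, while late rounds switch to the strong-channel device. The multi-device extension mentioned in the abstract would amount to taking the top-$m$ scores per round, e.g. `np.argsort(score)[-m:]`.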