Deep reinforcement learning based reactive power regulation and its optimization in power grids

Zhou Yi; Zhou Liangcai; Sheng Xu; Gu Dongjian; Shen Weijian; Chen Qing

doi:10.2478/amns-2024-3041

Applied Mathematics and Nonlinear Sciences (Jan 2024)

Affiliations

Zhou Yi: East China Branch of State Grid Corporation, Shanghai, 200002, China.
Zhou Liangcai: East China Branch of State Grid Corporation, Shanghai, 200002, China.
Sheng Xu: NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, Jiangsu, 211000, China.
Gu Dongjian: NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, Jiangsu, 211000, China.
Shen Weijian: NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, Jiangsu, 211000, China.
Chen Qing: NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, Jiangsu, 211000, China.

DOI: https://doi.org/10.2478/amns-2024-3041
Journal volume & issue: Vol. 9, no. 1

Abstract

This study formulates grid reactive power regulation as a Markov game within a deep reinforcement learning framework and optimizes it with the HAPPO algorithm, yielding a real-time grid optimization strategy based on multi-agent reinforcement learning. Building on this strategy, a grid power management method is developed using a Markov decision process and an improved deep deterministic policy gradient method, and a grid operation optimization model based on deep reinforcement learning is constructed. The model is then evaluated in case studies. Its maximum error is less than 5%, indicating high fitting accuracy. The maximum node voltage offset is 0.0025, indicating high voltage quality. Compared with long-term-scale reactive power optimization, real-time optimization yields an average voltage offset that is 97.9% lower and a maximum voltage offset that is 75.4% lower. The model's average running cost and its standard deviation both increase as communication impairment grows. The proposed approach achieves the lowest optimization cost, reducing it by 1.12%, 6.67%, 10.93%, and 0.94% relative to the four other approaches compared.
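
As a reading aid for the voltage figures quoted in the abstract, the following is a minimal sketch of how an average offset, a maximum offset, and a percentage reduction could be computed. It assumes that "voltage offset" means the absolute deviation of a node voltage from a nominal value of 1.0 per unit; the abstract does not state the paper's exact definition, and the function names and sample values are hypothetical.

```python
def voltage_offsets(voltages_pu, nominal_pu=1.0):
    """Return (average, maximum) absolute deviation from the nominal voltage.

    Assumption: offset = |V - V_nominal| in per unit; the paper may define it differently.
    """
    deviations = [abs(v - nominal_pu) for v in voltages_pu]
    return sum(deviations) / len(deviations), max(deviations)


def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical node voltages (per unit), not taken from the paper.
sample_voltages = [1.000, 1.002, 0.998, 1.001]
avg_offset, max_offset = voltage_offsets(sample_voltages)
print(f"average offset: {avg_offset:.5f}, maximum offset: {max_offset:.5f}")
```

Under this reading, a statement such as "75.4% lower" corresponds to `percent_reduction(baseline_max, realtime_max)` evaluating to about 75.4, where both arguments are maximum offsets obtained from the two optimization schemes being compared.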

Published in Applied Mathematics and Nonlinear Sciences

ISSN: 2444-8656 (Online)
Publisher: Sciendo
Country of publisher: Poland
LCC subjects: Science: Mathematics
Website: https://sciendo.com/journal/AMNS