Applied Sciences (Dec 2020)

Preventive Control Policy Construction in Active Distribution Network of Cyber-Physical System with Reinforcement Learning

  • Pengpeng Sun,
  • Yunwei Dong,
  • Sen Yuan,
  • Chong Wang

DOI
https://doi.org/10.3390/app11010229
Journal volume & issue
Vol. 11, no. 1
p. 229

Abstract

Once an active distribution network of a cyber-physical system is in an alert state, it is vulnerable to cross-domain cascading failures. A preventive control policy is therefore needed to move the network from the alert state back to a normal state. In practice, it is difficult to construct and analyze such a policy with theoretical analysis or physical experiments: theoretical analysis may be inaccurate because it relies on approximated models, and physical experiments are expensive and time-consuming because they require building prototypes. This paper presents a preventive control policy construction method based on the deep deterministic policy gradient idea (abbreviated as PCMD) to generate and optimize a preventive control policy with Artificial Intelligence (AI) technologies. It adopts reinforcement learning to make full use of the available historical data and so overcome the problems of high cost and low accuracy. First, a preventive control model is designed based on finite automaton theory; this model guides data collection and learning policy selection. The control model uses voltage stability, frequency stability, current overload prevention, and control cost reduction as the feedback variable, without requiring specific power flow equations or differential equations. Then, after sufficient training, a locally optimal preventive control policy can be constructed under the comparability condition between a fitted action-value function and a fitted policy function. The constructed policy consists of control actions that achieve a low cost and follow the principle of shortening the propagation sequence of cross-domain cascading failures as far as possible. The PCMD is more flexible and closer to reality than theoretical analysis methods and has a lower cost than physical experimental methods.
To evaluate the performance of the proposed method, an experimental case study is carried out on the China Electric Power Research-Cyber-Physical System (abbreviated as CEPR-CPS), which comes from the China Electric Power Research Institute. The results show that preventive control policy construction with the PCMD is more effective than most current methods, such as the multi-agent method, in terms of reducing the number of failure nodes and avoiding state space explosion.
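
As an illustration only, the feedback variable described in the abstract (voltage stability, frequency stability, current overload prevention, and control cost) could be combined into a single scalar signal for a reinforcement learning agent roughly as follows. This is a minimal sketch and not the paper's formulation: the `GridObservation` fields, the tolerance bands, the 50 Hz nominal frequency, and the weights are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GridObservation:
    """Hypothetical measurement snapshot; all field names are assumptions."""
    voltages_pu: Sequence[float]    # bus voltages in per unit
    frequency_hz: float             # system frequency
    line_loadings: Sequence[float]  # line current divided by rated current
    control_cost: float             # cost of the control action just applied


def feedback(obs, nominal_hz=50.0, v_band=0.05, f_band=0.2,
             weights=(1.0, 1.0, 1.0, 0.1)):
    """Scalar feedback: 0 is best, more negative is worse (values assumed)."""
    w_v, w_f, w_i, w_c = weights
    # Voltage: penalise deviation from 1.0 p.u. beyond the tolerated band.
    v_pen = sum(max(0.0, abs(v - 1.0) - v_band) for v in obs.voltages_pu)
    # Frequency: penalise deviation from nominal beyond the tolerated band.
    f_pen = max(0.0, abs(obs.frequency_hz - nominal_hz) - f_band)
    # Current: penalise loading above 100 % of the line rating.
    i_pen = sum(max(0.0, load - 1.0) for load in obs.line_loadings)
    # Control cost enters directly, so cheaper actions score higher.
    return -(w_v * v_pen + w_f * f_pen + w_i * i_pen + w_c * obs.control_cost)


healthy = GridObservation([1.0, 0.98], 50.0, [0.6, 0.8], 0.0)
stressed = GridObservation([0.90], 50.0, [1.2], 0.0)
```

With these assumed settings, `feedback(healthy)` is zero and `feedback(stressed)` is negative, so an agent maximising the signal is pushed toward states inside the stability bands and toward cheaper actions. Note that the signal is computed from measurements alone, which mirrors the abstract's point that no explicit power flow or differential equations are needed.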

Keywords