ZHANG Pei, ZHU Zhujun, XIE Hua. Reactive Power Optimization Based on Proximal Policy Optimization of Deep Reinforcement Learning[J]. Power System Technology, 2023, 47(2): 562-570. DOI: 10.13335/j.1000-3673.pst.2022.0728

Reactive Power Optimization Based on Proximal Policy Optimization of Deep Reinforcement Learning

Abstract: The fluctuations of renewable energy and loads pose a greater challenge to reactive power optimization. Considering the time-varying characteristics of renewable energy and loads, the reactive power optimization problem is formulated as a reinforcement learning problem. A constraint-target division and target-presetting method is proposed to design the reward function, and the proximal policy optimization algorithm is used to solve the reinforcement learning problem and obtain the reactive power optimization policy. A case study on a modified IEEE 39-bus system shows that the proposed reward function improves the convergence speed of the agent, and that the reinforcement-learning-based reactive power optimization strategy is superior to the traditional deterministic optimization algorithm in both decision-making effect and decision-making time.
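To make the constraint-target division idea concrete, the sketch below separates the reward into a constraint term (penalizing bus-voltage limit violations) and an objective term (scoring network loss against a preset target). This is a minimal illustrative reading of the abstract, not the paper's implementation: the voltage limits, loss target, and penalty coefficient are assumed values, and all function and variable names are hypothetical.

```python
import numpy as np

# Assumed constants for illustration only (not values from the paper).
V_MIN, V_MAX = 0.95, 1.05   # per-unit bus voltage limits (assumption)
LOSS_TARGET = 0.02          # preset target for network loss, p.u. (assumption)
PENALTY_COEF = 10.0         # weight on constraint violations (assumption)

def reward(bus_voltages: np.ndarray, network_loss: float) -> float:
    """Constraint-target division reward sketch:
    objective term scored against a preset target, minus a
    separate penalty term for voltage-limit violations."""
    # Constraint part: total amount by which voltages leave [V_MIN, V_MAX].
    violation = np.sum(np.maximum(bus_voltages - V_MAX, 0.0)
                       + np.maximum(V_MIN - bus_voltages, 0.0))
    # Objective part: higher reward the further the loss falls below target.
    objective = LOSS_TARGET - network_loss
    return float(objective - PENALTY_COEF * violation)

# Example: one 10-bus snapshot with all voltages in range and a small loss.
if __name__ == "__main__":
    v = np.full(10, 1.0)
    print(reward(v, network_loss=0.015))  # positive reward, no penalty
```

Splitting the two terms this way keeps infeasible operating points strongly discouraged while still giving the agent a graded signal toward the preset objective; the PPO agent would then maximize the expected discounted sum of this reward.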
