Abstract:
Fluctuations in renewable generation and load pose a major challenge to reactive power optimization. To account for the time-varying characteristics of renewable energy sources and loads, the reactive power optimization problem is formulated as a reinforcement learning problem. A reward function is designed using a method of constraint-target division and target presupposition, and the proximal policy optimization (PPO) algorithm is used to solve the reinforcement learning problem and obtain the reactive power optimization policy. A case study on the modified IEEE 39-bus system shows that the proposed reward function improves the convergence speed of the agent, and that the reinforcement-learning-based reactive power optimization strategy is superior to the traditional deterministic optimization algorithm in both decision quality and decision-making time.