Transmission Limit Calculation of Corridors Based on Transferable Reinforcement Learning

李康文; 邱高; 刘挺坚; 刘友波; 刘俊勇; 丁理杰

doi:10.13335/j.1000-3673.pst.2023.0548

Transmission Limit Calculation of Corridors Based on Transferable Reinforcement Learning

Abstract

The transmission limit of a corridor is the reduced-dimension projection of the grid's security boundary onto the corridor's cut set. Computing it is in essence a complex mixed-integer, nonconvex, nonlinear problem that involves voltage and reactive power optimization together with multiple types of stability constraints. The integration of renewable energy further enlarges the problem dimension, making it hard to solve with traditional methods. To address this, a method for calculating corridor transmission limits based on transferable reinforcement learning is proposed. First, a mixed-integer transmission limit model containing differential-algebraic equations is established, which considers transient rotor angle and voltage stability constraints and accounts for reactive power resources such as capacitor banks. Then, the model is transformed into a mixed-integer Markov decision process, and a proximal policy optimization solution method based on a mixed Categorical distribution is proposed. Finally, a policy distribution entropy maximization objective is introduced to ensure that the trained model transfers to unseen operating modes, enabling fast transmission limit analysis when operating modes or boundary conditions switch. Case study results on the IEEE 39-bus system show that, compared with a traditional metaheuristic black-box optimization algorithm, the proposed method improves computational efficiency by 97.15% with almost no loss of accuracy.
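The abstract names two ingredients, a policy built on a mixed Categorical distribution and an entropy maximization term added to the proximal policy optimization objective. The sketch below illustrates one plausible reading of that combination and is not the authors' implementation. It assumes that each decision variable (for example a discretized generator setpoint or a capacitor bank step) gets its own independent Categorical head, and the function names, the clipping width `clip_eps`, and the entropy weight `ent_coef` are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()


def sample_mixed_action(logits_per_dim, rng):
    """Sample one index per action dimension from independent Categorical heads.

    Assumption: the joint policy factorizes over dimensions, so the joint
    log-probability and the joint entropy are sums over the heads.
    Returns (action indices, joint log-probability, joint entropy).
    """
    action, log_prob, entropy = [], 0.0, 0.0
    for logits in logits_per_dim:
        p = softmax(logits)
        log_p = np.log(np.clip(p, 1e-12, None))  # guard against log(0)
        a = int(rng.choice(len(p), p=p))
        action.append(a)
        log_prob += float(log_p[a])
        entropy += float(-(p * log_p).sum())
    return action, log_prob, entropy


def ppo_clip_objective(ratio, advantage, entropy, clip_eps=0.2, ent_coef=0.01):
    """Per-sample clipped PPO surrogate plus an entropy bonus (to be maximized)."""
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    surrogate = min(ratio * advantage, clipped * advantage)
    return float(surrogate + ent_coef * entropy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical action space: a generator setpoint discretized into 5 bins
    # and a capacitor bank with 4 switching steps.
    heads = [np.zeros(5), np.zeros(4)]
    action, log_prob, entropy = sample_mixed_action(heads, rng)
    print(action, log_prob, entropy)
    print(ppo_clip_objective(ratio=1.5, advantage=1.0, entropy=entropy))
```

A larger `ent_coef` keeps the policy distribution broader during training. This matches the abstract's stated motivation for entropy maximization, namely preserving the policy's ability to transfer to unseen operating modes, although the weight the paper actually uses is not given here.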
