Citation: DENG Bairong, CHEN Junbin, DING Qiaoyi, PAN Zhenning, YU Tao, WANG Keying, HOU Jiaxuan. Multi-task Deep Reinforcement Learning Optimal Dispatching Based on Grid Operation Scenario Clustering[J]. Power System Technology, 2023, 47(3): 978-987. DOI: 10.13335/j.1000-3673.pst.2022.1140

Multi-task Deep Reinforcement Learning Optimal Dispatching Based on Grid Operation Scenario Clustering

Abstract: Under the "carbon peaking and carbon neutrality" goals and the construction of the new-type power system, the high penetration of renewable energy significantly increases the stochasticity of the power system and makes its operating conditions complex and diverse. Traditional single-task deep reinforcement learning struggles to adapt to the high randomness on both the source and load sides, and its dispatch decisions cannot meet the new power system's requirements for wind and solar accommodation and power balance. This paper therefore proposes a multi-task deep reinforcement learning optimal dispatch method that incorporates grid operation scenario clustering. During offline training, the method identifies typical operation scenarios and their important features from massive dispatch and operation data using spatial clustering and decision trees, and builds a multi-layer perceptron classifier that discriminates scenario categories. The multi-task deep reinforcement learning model is then constructed and partitioned according to these scenario categories, and each sub-task learner is trained in a differentiated way, from its data sources to its state and action design. During online decision-making, the classifier identifies the scenario category from limited real-time operation data and invokes the corresponding model to solve the real-time dispatch task quickly, enabling fast multi-task transfer learning in highly stochastic scenarios and preserving the optimality of the dispatch decisions. Case studies verify the feasibility and economy of the resulting solutions, and the experimental results show that the proposed scenario-clustering multi-task algorithm significantly improves the economic benefit of dispatch decisions compared with the traditional single-task algorithm.
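
The offline/online pipeline summarized in the abstract can be illustrated with a short sketch. The snippet below is a minimal illustration, assuming scikit-learn's KMeans, DecisionTreeClassifier and MLPClassifier as stand-ins for the paper's spatial clustering, feature identification and scenario classifier, with synthetic data and a hypothetical `DispatchPolicy` stub in place of the per-scenario deep reinforcement learning agents; it is not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Placeholder for the massive historical dispatch data: one row per operating
# period, with features such as load level, wind/PV output and net-load ramps.
X_hist = rng.normal(size=(2000, 12))

# --- Offline stage -----------------------------------------------------------
# 1) Spatial clustering of historical operating data into typical scenarios.
n_scenarios = 4
scaler = StandardScaler().fit(X_hist)
X_s = scaler.transform(X_hist)
labels = KMeans(n_clusters=n_scenarios, n_init=10, random_state=0).fit_predict(X_s)

# 2) A decision tree exposes which features separate the scenarios best.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_s, labels)
important = np.argsort(tree.feature_importances_)[::-1][:6]

# 3) Multi-layer perceptron classifier that discriminates scenario categories
#    from the reduced feature set.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_s[:, important], labels)

# 4) One dispatch policy (sub-task learner) per scenario category; here each
#    policy is a stub standing in for a separately trained DRL agent.
class DispatchPolicy:
    def __init__(self, scenario_id):
        self.scenario_id = scenario_id

    def act(self, state):
        # A trained agent would map the grid state to unit set-points here.
        return {"scenario": self.scenario_id, "setpoints": np.zeros(3)}

policies = {k: DispatchPolicy(k) for k in range(n_scenarios)}

# --- Online stage ------------------------------------------------------------
# Classify limited real-time measurements, then invoke the matching policy.
x_now = rng.normal(size=(1, 12))
x_now_s = scaler.transform(x_now)
scenario = int(clf.predict(x_now_s[:, important])[0])
print("identified scenario:", scenario)
print("dispatch action:", policies[scenario].act(x_now_s))
```

The point mirrored here is the routing step: the online decision retrains nothing, it only classifies the current scenario from limited measurements and hands the state to the sub-task learner trained for that scenario.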

     
