Citation: XU Hongwei, XU Gang, WU Biqiong, REN Yufeng. Real-time decision-making method for unit commitment of Three Gorges hydropower station based on deep reinforcement learning[J]. JOURNAL OF HYDROELECTRIC ENGINEERING, 2024, 43(8): 76-88.

Real-time decision-making method for unit commitment of Three Gorges hydropower station based on deep reinforcement learning

  • Abstract: This paper addresses a key problem in the in-plant economic operation of the Three Gorges hydropower station: real-time load allocation across a large number of units, with the objective of minimizing water consumption. Conventional dynamic programming suffers from the curse of dimensionality when applied to the station's large-scale unit cluster and therefore cannot meet the real-time requirements of dispatch decisions. We propose a deep reinforcement learning-based framework for model training and decision-making in multi-period unit load allocation: a deep neural network is trained with deep reinforcement learning, and the pre-trained network then generates the unit load allocation plan. Group theory is applied to the processing of state and action features, which significantly compresses the state and action spaces and improves training efficiency. The results show that, compared with dynamic programming, the proposed method reduces decision-making time by two orders of magnitude at the cost of a benefit loss of less than 1%, while maintaining the accuracy of the optimized solution. It thus offers a fast and efficient solution for load allocation among large-scale unit groups in hydropower stations.
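To illustrate how a symmetry argument can shrink a state space, the sketch below assumes (this is not stated in the abstract) that the relevant group is the permutation of units with identical characteristics, so that on/off vectors differing only by swapping such units fall into one orbit. The plant layout, the unit type labels, and the function `canonical_state` are hypothetical and are not taken from the paper.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def canonical_state(status: Sequence[int],
                    unit_type: Sequence[str]) -> Tuple[Tuple[str, int], ...]:
    """Map an on/off vector to an orbit representative.

    Assumption: units of the same type are interchangeable, so only the
    number of running units per type matters, not which ones are running.
    """
    counts: Dict[str, int] = {}
    for s, t in zip(status, unit_type):
        counts[t] = counts.get(t, 0) + s
    return tuple(sorted(counts.items()))


# Hypothetical plant: four units of type "A" and two of type "B".
unit_type = ["A", "A", "A", "A", "B", "B"]

# Every raw on/off combination (2 ** 6 of them).
raw_states = list(product((0, 1), repeat=len(unit_type)))

# Distinct states once permutations of identical units are merged.
orbits = {canonical_state(s, unit_type) for s in raw_states}

print(len(raw_states), len(orbits))
```

In this toy case the 64 raw on/off combinations collapse to 15 representatives, (4 + 1) x (2 + 1). The same counting explains why merging equivalent states matters more as the number of interchangeable units grows.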

     
