SONG Weiye, LIU Lingyue, YAN Jie, WANG Hangyu, HE Shukai, HAN Shuang, WANG Minghui, LIU Yongqian. Self-evolving Power Smooth Control Method for Offshore Wind Power Cluster Based On Deep Reinforcement Learning[J]. Electric Power, 2023, 56(3): 36-46. DOI: 10.11930/j.issn.1004-9649.202206099

Self-evolving Power Smooth Control Method for Offshore Wind Power Cluster Based On Deep Reinforcement Learning

Abstract: Wind speeds across an offshore wind power cluster are strongly correlated in space and time, which amplifies fluctuations in the cluster's aggregate active power output and makes the impact of large-scale grid integration on the power system more pronounced. Smoothing control of the cluster's active power output is a key means of mitigating these problems. Conventional methods, however, optimize too slowly to support high-frequency control and are overly sensitive to forecast errors and to deviations between the executed action and the optimal control command. This paper therefore proposes a control framework of “offline policy training, fast online optimization, and self-evolving control performance”, and establishes a deep-reinforcement-learning-based model for smoothing the active power output of offshore wind power clusters. First, a short-term revenue function for cluster power smoothing control is proposed, and the optimal command is solved on the basis of a Markov decision process model. Second, a long-term revenue policy function for power policy calibration is proposed, which corrects control deviations using historical feedback data. Finally, a deep neural network model is established to map among the agent's state, the control revenue, and the control decision, so that the agent can be trained and solved with the deep deterministic policy gradient (DDPG) algorithm. Case study results show that, under the given wind condition with an average wind speed of 7.5 m/s, the proposed method reduces the power fluctuation amplitude by as much as 20% while keeping the loss of energy production within 5%.
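The abstract does not give the form of the short-term revenue function, so the following is only a minimal sketch of the trade-off it describes, read here as a per-step reward: smoother output is bought with some curtailed power. The function name `short_term_reward`, the weights `w_smooth` and `w_loss`, and the per-unit quantities are all hypothetical and are not taken from the paper.

```python
def short_term_reward(prev_power, new_power, available_power,
                      w_smooth=1.0, w_loss=0.5):
    """Hypothetical per-step reward (not the paper's formula).

    All powers are in per-unit of cluster capacity. The reward penalizes
    the step-to-step change in output and the power that is curtailed.
    """
    if not 0.0 <= new_power <= available_power:
        raise ValueError("setpoint must lie between 0 and the available power")
    ramp = abs(new_power - prev_power)        # step-to-step fluctuation
    curtailed = available_power - new_power   # power given up to smooth the output
    return -(w_smooth * ramp + w_loss * curtailed)


# Available power rises from 0.60 to 0.80 p.u. in one control step.
follow = short_term_reward(0.60, 0.80, 0.80)   # track the wind fully
partial = short_term_reward(0.60, 0.70, 0.80)  # ramp only halfway
```

With these assumed weights the partial ramp scores higher than following the wind, because halving the ramp costs less than the curtailment it requires. In a reinforcement-learning setup, a value of this kind would be returned to the agent after each control action; the weights set how much energy the controller is willing to give up for a smoother output.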

     
