Citation: ZHOU Xiang, CHEN Sheng, ZHANG Jinyuan, YUAN Xin, WANG Xinying, WANG Jiye. Research on Optimal Dispatch of Microgrid Based on Improved Deep Deterministic Policy Gradient[J]. Electric Power Information and Communication Technology, 2022, 20(7): 65-74. DOI: 10.16543/j.2095-641x.electric.power.ict.2022.07.009

Research on Optimal Dispatch of Microgrid Based on Improved Deep Deterministic Policy Gradient

  • Abstract: As an important part of the Energy Internet, the microgrid is of great significance for the local consumption of wind, solar, and other new energy sources. However, the intermittency and volatility of distributed wind and photovoltaic output, together with the randomness of load-side electricity demand, pose major challenges to the optimal dispatch of microgrids. To address the uncertainty of distributed new energy output and user electricity consumption, this paper adopts a deep deterministic policy gradient (DDPG) algorithm based on a classified experience replay mechanism, which adapts to the uncertainty of wind, solar, and load in a data-driven manner to solve the microgrid optimal dispatch problem. Taking the time-of-use electricity price and the penalty for wind and solar curtailment into account, a reward mechanism is designed to minimize operating costs and maximize the accommodation of new energy, and the experience pool is classified according to the magnitude of the immediate reward to improve the training speed and convergence performance of the model. Finally, simulations on an IEEE 14-node case show that the proposed method can generate optimal dispatch strategies in real time without accurate modeling of wind and photovoltaic output or load, and that the economic cost of dispatch is 4.73% lower than that of the deep Q-network (DQN) algorithm.
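The abstract gives no formulas, so the following Python sketch is only one possible reading of the two ideas it names: a reward that penalizes purchase cost under a time-of-use price plus wind and solar curtailment, and an experience pool split by the magnitude of the immediate reward. The function and class names, the two-pool split, the `reward_threshold`, and the `high_fraction` sampling ratio are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import deque


def dispatch_reward(grid_power_kw, price_per_kwh, curtailed_kw,
                    curtail_penalty_per_kwh, dt_hours=1.0):
    """Hypothetical reward: negative of (purchase cost + curtailment penalty)."""
    # Only power bought from the grid is charged at the time-of-use price.
    purchase_cost = max(grid_power_kw, 0.0) * price_per_kwh * dt_hours
    # Curtailed wind/solar energy is penalized to encourage accommodation.
    curtail_cost = curtailed_kw * curtail_penalty_per_kwh * dt_hours
    return -(purchase_cost + curtail_cost)


class ClassifiedReplayBuffer:
    """Assumed two-pool replay buffer, classified by immediate reward."""

    def __init__(self, capacity, reward_threshold, high_fraction=0.5, seed=None):
        self.high = deque(maxlen=capacity)  # transitions with reward >= threshold
        self.low = deque(maxlen=capacity)   # transitions with reward < threshold
        self.reward_threshold = reward_threshold
        self.high_fraction = high_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        pool = self.high if reward >= self.reward_threshold else self.low
        pool.append((state, action, reward, next_state, done))

    def __len__(self):
        return len(self.high) + len(self.low)

    def sample(self, batch_size):
        # Draw the target share from the high-reward pool, the rest from the
        # low-reward pool, and top up from the high pool if the low pool is short.
        n_high = min(int(round(batch_size * self.high_fraction)), len(self.high))
        n_low = min(batch_size - n_high, len(self.low))
        n_high = min(batch_size - n_low, len(self.high))
        batch = (self.rng.sample(list(self.high), n_high)
                 + self.rng.sample(list(self.low), n_low))
        self.rng.shuffle(batch)
        return batch
```

In a DDPG training loop, each transition would be stored with `add` and mini-batches drawn with `sample` before the critic and actor updates. How the paper actually sets the classification boundary and the sampling proportions is not stated in the abstract.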

     
