Citation: LUO Jian-xun, ZHANG Wei, WANG Hui, DENG Li-jun. Research on Optimal Scheduling of Micro-Grid Based on Deep Reinforcement Learning[J]. Journal of Electric Power, 2023, 38(1): 54-63. DOI: 10.13357/j.dlxb.2023.006

Research on Optimal Scheduling of Micro-Grid Based on Deep Reinforcement Learning

  • Abstract: This paper addresses the scheduling optimization problem of microgrids. The object of study is a novel microgrid model consisting of a wind turbine generator, an energy storage system, thermostatically controlled loads, and price-responsive loads. The goal is to minimize the economic operating cost of the microgrid while accounting for the impact of the volatility and randomness of wind power generation on its safe and economic operation. Building on the framework of the AC algorithm, an improved A3C algorithm is proposed. Asynchronous training is realized through multi-threading, and the experience replay mechanism of the DQN algorithm, changed from uniform sampling to importance sampling, is added to the training of the A3C algorithm. The DQN, AC, and improved A3C algorithms are each trained in simulation and compared. The results show that the improved A3C algorithm increases sample utilization and shortens training, with a training time of 1.46 min. When wind power fluctuates, the model trained with the improved A3C algorithm controls the charging and discharging of the energy storage device according to the real-time electricity price of the main grid. This reduces the electricity the microgrid purchases from the main grid when prices are high and shifts bulk purchases to periods when prices are low, which lowers the purchase cost. The scheduling strategy given by the model improves economic benefit and effectively reduces the impact of wind power fluctuations on the microgrid. The proposed scheme can serve as a reference for intelligent microgrid dispatching.
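The abstract does not spell out how the replay mechanism moves from uniform sampling to importance sampling. As a rough illustration only, the sketch below assumes a scheme in the style of prioritized experience replay: each stored transition carries a priority derived from its TD error, sampling is proportional to that priority, and importance-sampling weights correct for the resulting bias. The class name `PrioritizedReplayBuffer` and the hyperparameters `alpha`, `beta`, and `eps` are hypothetical and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative replay buffer with priority-proportional sampling.

    Assumed design, not the paper's: priority = |TD error| + eps.
    """

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-3, seed=0):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.beta = beta        # strength of the importance-sampling correction
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full
        self.rng = random.Random(seed)

    def add(self, transition):
        # New transitions take the current maximum priority,
        # so they are likely to be replayed soon after being stored.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        n = len(self.data)
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = self.rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized so the largest is 1.
        weights = [(n * probs[i]) ** (-self.beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors):
        # Called after a learning step with the fresh TD errors.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

With `alpha=0` every transition is equally likely and the buffer reduces to uniform sampling, which is the baseline behavior the abstract attributes to DQN.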

     
