Citation: ZHANG Chao, ZHAO Dongmei, JI Yu, ZHANG Ying. Real Time Optimal Dispatch of Virtual Power Plant Based on Improved Deep Q Network[J]. Electric Power, 2024, 57(1): 91-100. DOI: 10.11930/j.issn.1004-9649.202307006

Real Time Optimal Dispatch of Virtual Power Plant Based on Improved Deep Q Network

Abstract: Deep reinforcement learning algorithms are data-driven and do not rely on an explicit system model, which makes them well suited to the complexity of virtual power plant (VPP) operation. However, existing algorithms struggle to enforce operational constraints strictly, which limits their application in practical systems. To overcome this problem, an improved deep Q-network (MDQN) algorithm based on deep reinforcement learning is proposed. The algorithm expresses the deep neural network as a mixed-integer programming formulation so that all operational constraints are strictly enforced within the action space, which ensures that the resulting dispatch schedule is feasible in actual operation. In addition, a sensitivity analysis is conducted so that the hyperparameters can be adjusted flexibly when tuning the algorithm. Finally, comparative experiments verify the superior performance of the MDQN algorithm. The algorithm thus offers an effective solution to the complexity of VPP operation.
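The abstract does not give the formulation itself, so the following is only a minimal sketch of the general idea, not the authors' MDQN. It assumes a hypothetical one-hidden-layer ReLU Q-network with made-up weights (`W`, `B`, `V`), a single scalar action, and a simple box constraint on that action. Each ReLU is represented by an on/off binary; once the binaries are fixed, Q is linear in the action, so the best action that satisfies the constraint can be found exactly. Here the binaries are enumerated by brute force, whereas a practical implementation would presumably pass the equivalent mixed-integer program to a solver.

```python
from itertools import product

# Hypothetical one-hidden-layer Q-network for one fixed state and a scalar
# action a: Q(a) = sum_j V[j] * relu(W[j] * a + B[j]). Weights are made up.
W = [1.0, 1.0, -1.0]
B = [0.0, -1.0, 2.0]
V = [1.0, -2.0, 0.5]


def q_value(a):
    """Plain forward pass, used only to cross-check the result."""
    return sum(v * max(0.0, w * a + b) for w, b, v in zip(W, B, V))


def best_feasible_action(a_min, a_max):
    """Maximize Q over a_min <= a <= a_max by enumerating ReLU binaries."""
    best = None
    for pattern in product((0, 1), repeat=len(W)):
        lo, hi = a_min, a_max
        consistent = True
        for w, b, d in zip(W, B, pattern):
            # d = 1 requires w*a + b >= 0 (unit active),
            # d = 0 requires w*a + b <= 0 (unit inactive).
            if w == 0.0:
                consistent = consistent and (b >= 0.0 if d else b <= 0.0)
            elif (w > 0.0) == bool(d):
                lo = max(lo, -b / w)
            else:
                hi = min(hi, -b / w)
        if not consistent or lo > hi:
            continue  # this on/off pattern is impossible inside the bounds
        # With the binaries fixed, Q is linear in a on the interval [lo, hi].
        slope = sum(v * w * d for w, v, d in zip(W, V, pattern))
        intercept = sum(v * b * d for b, v, d in zip(B, V, pattern))
        for a in (lo, hi):  # a linear function peaks at an endpoint
            q = slope * a + intercept
            if best is None or q > best[1]:
                best = (a, q)
    return best


if __name__ == "__main__":
    print(best_feasible_action(0.0, 3.0))
    print(best_feasible_action(1.5, 3.0))
```

With the bounds `[0.0, 3.0]` the sketch selects the action `1.0`; tightening the bounds to `[1.5, 3.0]` moves the choice to `1.5`. In both cases the returned action lies inside the bounds by construction, which is the property the abstract emphasizes.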

     
