Citation: HU Daner, PENG Yonggang, WEI Wei, XIAO Tingting, CAI Tiantian, XI Wei. Multi-timescale Deep Reinforcement Learning for Reactive Power Optimization of Distribution Network[J]. Proceedings of the CSEE, 2022, 42(14): 5034-5044. DOI: 10.13334/j.0258-8013.pcsee.213110

Multi-timescale Deep Reinforcement Learning for Reactive Power Optimization of Distribution Network

Abstract: With the integration of a high proportion of distributed generation, distribution networks face considerable challenges in handling source-load uncertainty and coordinating a variety of reactive power compensation devices. This paper proposes a multi-timescale voltage regulation strategy for distribution networks that combines a mathematical optimization model with a data-driven method. First, for the on-load tap changer and switched capacitor banks regulated on the slow timescale, a day-ahead reactive power and voltage optimization model is formulated as a mixed-integer second-order cone program with the objective of minimizing active power loss. Second, to meet the real-time requirements of the fast-timescale stage, an intraday real-time dispatch method based on multi-agent reinforcement learning is proposed: the real-time reactive power optimization problem is cast as a Markov game and solved under a centralized-training, decentralized-execution framework. Compared with traditional methods, the proposed method has low communication overhead, good real-time performance, and does not rely on an accurate power flow model. Finally, the effectiveness of the proposed strategy is verified on the IEEE 33-bus test system.
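
To make the day-ahead stage concrete, the following is a minimal sketch (in Python with cvxpy) of a mixed-integer second-order cone formulation in the spirit of the abstract's day-ahead model: a toy 3-bus radial feeder under the DistFlow second-order cone relaxation, with a discrete OLTC tap at the substation and a switchable capacitor bank, minimizing active power loss. The feeder, all network data, and all device parameters below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (not the paper's full model): day-ahead MISOCP reactive
# power optimization on an assumed 3-bus radial feeder, using the DistFlow
# second-order cone relaxation. Bus 0 is the substation with a discrete
# OLTC tap; bus 2 hosts a switchable capacitor bank.
import cvxpy as cp
import numpy as np

# Illustrative line and load data (per unit): branches 0->1 and 1->2.
r = np.array([0.02, 0.03])        # branch resistance
x = np.array([0.04, 0.05])        # branch reactance
p_load = np.array([0.3, 0.5])     # active load at buses 1, 2
q_load = np.array([0.1, 0.2])     # reactive load at buses 1, 2

# Discrete device models (assumed step sizes).
taps = 1.0 + 0.0125 * np.arange(-4, 5)   # 9 OLTC tap positions
q_step, n_banks = 0.05, 4                # capacitor: 4 x 0.05 p.u. steps

# Variables: squared voltages v, branch flows P/Q, squared currents l.
v = cp.Variable(3)                # |V_i|^2 at buses 0, 1, 2
P = cp.Variable(2)                # active flow on each branch
Q = cp.Variable(2)                # reactive flow on each branch
l = cp.Variable(2, nonneg=True)   # |I_ij|^2 on each branch
b = cp.Variable(len(taps), boolean=True)  # one-hot OLTC tap choice
n_cap = cp.Variable(integer=True)         # capacitor steps in service

cons = [cp.sum(b) == 1, v[0] == taps**2 @ b,  # OLTC fixes substation v
        n_cap >= 0, n_cap <= n_banks,
        0.9**2 <= v[1:], v[1:] <= 1.05**2]    # voltage magnitude limits

for k, (i, j) in enumerate([(0, 1), (1, 2)]):
    # DistFlow voltage drop along branch i->j.
    cons += [v[j] == v[i] - 2*(r[k]*P[k] + x[k]*Q[k])
             + (r[k]**2 + x[k]**2)*l[k]]
    # SOC relaxation of P^2 + Q^2 = l * v_i.
    cons += [cp.SOC(l[k] + v[i],
                    cp.hstack([2*P[k], 2*Q[k], l[k] - v[i]]))]

# Nodal balance: inflow minus branch losses = load + downstream flow.
cons += [P[0] - r[0]*l[0] == p_load[0] + P[1],
         Q[0] - x[0]*l[0] == q_load[0] + Q[1],
         P[1] - r[1]*l[1] == p_load[1],
         Q[1] - x[1]*l[1] == q_load[1] - q_step*n_cap]  # cap injects Q

prob = cp.Problem(cp.Minimize(r @ l), cons)  # minimize active power loss
prob.solve()  # requires a MISOCP-capable solver, e.g. MOSEK or SCIP
print(prob.status, prob.value)
```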
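For the intraday stage, the sketch below illustrates the centralized-training, decentralized-execution (CTDE) pattern the abstract describes, written in the style of MADDPG. The abstract does not specify the exact algorithm, agent count, observation/action dimensions, or network sizes, so all of those are assumptions here: each agent's actor acts on local measurements only (hence the low communication overhead at execution time), while a centralized critic sees the joint observations and actions during training.

```python
# Minimal CTDE sketch in the style of MADDPG; sizes and the assumption
# that each agent sets one continuous reactive-power setpoint are
# illustrative, not from the paper.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N_AGENTS = 6, 1, 3   # assumed local obs/action sizes

class Actor(nn.Module):
    """Decentralized policy: maps one agent's LOCAL observation
    (e.g. local voltage and P/Q measurements) to its Q setpoint."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)   # in [-1, 1], scaled to device limits

class CentralCritic(nn.Module):
    """Centralized critic: during training it sees ALL agents'
    observations and actions, capturing their interaction."""
    def __init__(self):
        super().__init__()
        dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, all_obs, all_act):
        return self.net(torch.cat([all_obs, all_act], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

# Execution is decentralized: each actor needs only its own observation,
# so no real-time communication between devices is required.
local_obs = [torch.randn(OBS_DIM) for _ in range(N_AGENTS)]
actions = [actor(obs) for actor, obs in zip(actors, local_obs)]

# Training is centralized: the critic scores the joint state-action pair.
q_value = critic(torch.cat(local_obs), torch.cat(actions))
print([a.detach().numpy() for a in actions], q_value.item())
```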

