Citation: LI Binxian, LI Jiayong, HAI Zheng, WAN Can, ZHU Lipeng, ZHANG Cong, LI Yang. Online Mitigation Method for Three-phase Imbalance in Distribution Network Based on Multi-agent Deep Reinforcement Learning[J]. Proceedings of the CSEE, 2025, 45(5): 1729-1740. DOI: 10.13334/j.0258-8013.pcsee.231908

Online Mitigation Method for Three-phase Imbalance in Distribution Network Based on Multi-agent Deep Reinforcement Learning

  • Abstract: As the integration of distributed energy resources grows, three-phase imbalance in distribution networks (DNs) is becoming increasingly prominent and poses a significant threat to their secure, stable, and economic operation. To address this issue, this paper proposes an online three-phase imbalance mitigation method for DNs that uses distributed photovoltaic (PV) systems as the controlled resources and is based on multi-agent deep reinforcement learning. First, the causes of three-phase imbalance in DNs are analyzed and collaborative mitigation objectives are formulated. Second, the DN is divided into multiple regions by geographical location, and each region is assigned an agent that learns the action strategy of the PV systems in that region, which establishes a multi-agent coordinated mitigation framework. Third, based on the multi-actor-attention-critic (MAAC) method, a centralized training algorithm for the agents' action strategies is proposed to coordinate and optimize the actions of a large number of dispersed PV systems. Finally, the trained action networks are deployed in their respective regions and generate PV action instructions online from real-time regional observations, achieving distributed, efficient, and coordinated mitigation of three-phase imbalance. The method is tested on a modified IEEE 123-bus distribution system, and comparison with four other typical methods verifies its effectiveness and superiority in three-phase imbalance mitigation.
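To make the ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that imbalance at a bus is measured by the common negative-to-positive-sequence voltage ratio (the abstract does not state the exact objective), that the attention critic uses scaled dot-product attention as in the general MAAC formulation, and that each regional action network maps local observations to normalized PV set-points. The function names `voltage_unbalance_factor`, `attention_weights`, and `regional_action`, and the linear form of the actor, are hypothetical and chosen only for illustration.

```python
import numpy as np

# Fortescue rotation operator a = 1∠120°
A = np.exp(2j * np.pi / 3)


def voltage_unbalance_factor(va, vb, vc):
    """Ratio |V_neg| / |V_pos| at one bus from complex phase voltages.

    Assumption: this ratio is one common imbalance metric; the paper's
    actual mitigation objective is not given in the abstract.
    """
    v_pos = (va + A * vb + A**2 * vc) / 3.0  # positive-sequence component
    v_neg = (va + A**2 * vb + A * vc) / 3.0  # negative-sequence component
    return abs(v_neg) / abs(v_pos)


def attention_weights(query, keys):
    """Scaled dot-product attention of one agent over the other agents.

    query: (d,) embedding of the agent being evaluated.
    keys:  (n_others, d) embeddings of the remaining agents.
    Returns n_others non-negative weights that sum to 1.
    """
    scores = keys @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()  # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()


def regional_action(weights, bias, observation):
    """Hypothetical linear actor for one region.

    Uses only the region's own observation, mirroring decentralized
    online execution; tanh keeps each PV set-point in [-1, 1].
    """
    return np.tanh(weights @ observation + bias)


if __name__ == "__main__":
    # Phase B sags to 0.9 p.u. while phases A and C stay at 1.0 p.u.
    vuf = voltage_unbalance_factor(1.0 + 0j, 0.9 * A**2, A)
    print(f"unbalance factor: {vuf:.4f}")

    rng = np.random.default_rng(0)
    w = attention_weights(rng.normal(size=4), rng.normal(size=(3, 4)))
    print("attention weights:", w)

    act = regional_action(rng.normal(size=(2, 5)), np.zeros(2), rng.normal(size=5))
    print("normalized PV set-points:", act)
```

In this sketch the attention weights would be needed only during centralized training, where the critic can see all regions; at run time each region evaluates `regional_action` on its own measurements, which matches the distributed online execution described above.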

     
