Citation: JU Yuntao (巨云涛), CHEN Xi (陈希). Distributed Active and Reactive Power Coordinated Optimal Scheduling of Networked Microgrids Based on Two-layer Multi-agent Reinforcement Learning[J]. Proceedings of the CSEE (中国电机工程学报), 2022, 42(23): 8534-8544. DOI: 10.13334/j.0258-8013.pcsee.212737

Distributed Active and Reactive Power Coordinated Optimal Scheduling of Networked Microgrids Based on Two-layer Multi-agent Reinforcement Learning

  • Abstract: To realize distributed, coordinated optimal scheduling of active and reactive power in networked microgrids, improving power supply reliability and reducing operating cost, this paper proposes a two-layer multi-agent reinforcement learning method in which agents are trained through interaction with the environment to learn the optimal scheduling strategy. The method does not rely on an accurate network model of the networked microgrids. Its two layers train a continuous-action agent group and a discrete-action agent group, respectively, because each sub-microgrid contains both continuous-action and discrete-action devices that must be controlled. In addition, a change in the topology of the networked microgrids changes the optimization task and can make the existing agent groups inapplicable. For this case, the paper gives the conditions under which knowledge transfer applies and then uses knowledge transfer, in which the experience of the existing agents is used to train the new agent group. This avoids training from scratch and reduces the required computation and time. Numerical experiments show that the proposed method is effective for distributed active and reactive power coordinated optimal scheduling of networked microgrids.

     
