SONG Yuhang, CHEN Yufan, WEI Yanling, GAO Shan. Charging Path Planning for Electric Vehicles Based on Reinforcement Learning Environment Design Strategy[J]. Automation of Electric Power Systems, 2024, 48(11): 184-196.

Charging Path Planning for Electric Vehicles Based on Reinforcement Learning Environment Design Strategy

Abstract: To address the charging path planning problem for electric vehicles, an environment modeling method suited to reinforcement learning is proposed. Based on real-world conditions such as the urban road grid and the geographical distribution of charging stations, the method divides the basic driving path of an electric vehicle into three segments. Building on this three-segment representation, design schemes for the state space, action space, state transition, and reward function are proposed, so that charging path planning is modeled as a Markov decision process and solved with Q-learning and the deep Q-network (DQN) method. Experimental results show that the reinforcement learning environment design based on the three-segment representation is solvable and transferable. It accounts for realistic scenarios such as deceleration and turning when an electric vehicle leaves the road to enter a charging station, and it simplifies the charging action into a choice of driving direction, which improves the efficiency of reinforcement learning algorithms based on Q-learning and DQN.
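As a rough illustration of the idea of treating charging as a direction choice inside a Markov decision process, the sketch below runs tabular Q-learning on a toy grid. It is not the paper's three-segment environment: the 4x4 grid, the station location, the reward values, the hyperparameters, and all names (`GRID`, `STATION`, `step`, `train`, `greedy_path`) are assumptions made only for this example. Entering the station cell stands in for the charging action.

```python
import random

# Hypothetical toy setup (not from the paper): a 4x4 grid of intersections,
# one charging station, and four driving directions as the only actions.
GRID = 4
START = (0, 0)
STATION = (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; a move off the grid leaves the state unchanged."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < GRID and 0 <= nc < GRID):
        nr, nc = r, c
    nxt = (nr, nc)
    done = nxt == STATION  # driving into the station cell counts as charging
    reward = 10.0 if done else -1.0  # assumed rewards: step cost, arrival bonus
    return nxt, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(GRID) for c in range(GRID)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells the learned policy visits on its way to the station. A DQN variant would replace the table `q` with a neural network over a richer state, for example one that includes battery level, which this sketch omits.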

     
