Abstract:
The organic Rankine cycle is the dominant internal combustion engine waste heat recovery technology owing to its high thermal efficiency and mature components. However, due to the complexity and variability of real road conditions, the safe and efficient control of waste heat recovery systems under transient conditions remains a major challenge. To address this issue, a deep reinforcement learning-based control method is proposed, combining offline optimal control learning with online decision-making. An experimentally validated dynamic simulation model of a transcritical organic Rankine cycle is developed as the training environment in which a safe and optimized control strategy is learned. Simulation results show that the deep reinforcement learning-based control achieves safer and more efficient operation than a conventional thermostatic PID control (which maintains a constant working fluid temperature at the expander inlet). The deep reinforcement learning-based control also exhibits strong extrapolation and generalization performance under untrained transient fluctuating heat sources. These results demonstrate the promising potential of deep reinforcement learning for the optimal control of thermodynamic cycles.
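To make the offline-learning/online-decision workflow concrete, the sketch below shows the general pattern in Python: a simulation environment serves as the training plant, a policy is learned offline, and the trained policy is then queried online to map measurements to actuator commands. This is a minimal illustration under stated assumptions, not the paper's implementation: the environment dynamics, observation/action definitions, reward shaping, temperature limits, and the choice of PPO from stable-baselines3 are all hypothetical placeholders standing in for the validated transcritical organic Rankine cycle model and the algorithm used in the work.

```python
# Illustrative sketch only: all dynamics, rewards, and limits below are assumed,
# not taken from the paper's experimentally validated ORC model.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO  # assumed RL library; the paper's algorithm may differ


class ORCEnv(gym.Env):
    """Toy stand-in for the transcritical ORC simulation used as the training environment."""

    def __init__(self):
        # Observation (hypothetical): heat-source temperature, working-fluid mass
        # flow, and working-fluid temperature at the expander inlet.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32)
        # Action (hypothetical): normalized working-fluid pump speed command.
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.array([500.0, 0.3, 420.0], dtype=np.float32)
        return self.state, {}

    def step(self, action):
        # Placeholder dynamics: a real environment would integrate the validated
        # dynamic ORC model forward by one control interval using `action`.
        self.state = self.state + self.np_random.normal(0.0, 0.1, size=3).astype(np.float32)
        inlet_temp = float(self.state[2])
        # Reward trades off a tracking/efficiency proxy against a safety penalty
        # when the expander-inlet temperature exceeds an assumed limit.
        reward = -abs(inlet_temp - 430.0) - (10.0 if inlet_temp > 450.0 else 0.0)
        return self.state, reward, False, False, {}


# Offline learning: train the control policy against the simulated cycle.
model = PPO("MlpPolicy", ORCEnv(), verbose=0)
model.learn(total_timesteps=10_000)

# Online decision: the trained policy maps measurements to pump commands in real time.
env = ORCEnv()
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```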