Abstract:
To address the combinatorial explosion of scenarios that arises when consumer choice resources (CCR) are scheduled under multiple stochastic scenarios, this paper uses a deep reinforcement learning algorithm to select the optimal CCR groups and to optimally schedule the nodes they contain. First, based on the constraints and objective function of optimal CCR scheduling, the mathematical model and its solution complexity over a daily scheduling cycle are analyzed. Then, the optimal scheduling process for CCR is mapped onto a situation awareness tuple based on the Markov decision process, and a situation orientation function is established on the architecture of the dueling deep Q network. Through repeated situation deductions, the situation orientation function is trained with mini-batch gradient descent, and the algorithm parameters are continuously fed back and updated to optimize the decisions. Finally, using the IEEE 33-bus test system and randomly generated sample sets of different sizes, the CCR groups to be selected are optimized under random operation modes, and the corresponding optimal scheduling strategy is formulated.
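As an illustration of the dueling deep Q network architecture mentioned above, the following minimal sketch shows how a state-value stream and an advantage stream can be combined into action values. It assumes the mean-subtracted aggregation Q(s, a) = V(s) + A(s, a) - mean_a A(s, a) that is commonly used with dueling networks; the abstract does not state which aggregation the paper uses. The function name, the single linear layers, the random weights, and the array sizes are all hypothetical and are not taken from the paper.

```python
import numpy as np


def dueling_q_values(features, w_value, w_adv):
    """Combine value and advantage streams into Q-values (illustrative only).

    features: (batch, n_features) state features
    w_value:  (n_features, 1) weights of the state-value stream
    w_adv:    (n_features, n_actions) weights of the advantage stream
    """
    value = features @ w_value        # V(s), shape (batch, 1)
    advantage = features @ w_adv      # A(s, a), shape (batch, n_actions)
    # Subtracting the mean advantage keeps V and A identifiable:
    # the Q-values of each state then average exactly to V(s).
    return value + advantage - advantage.mean(axis=1, keepdims=True)


# Hypothetical sizes: 4 states, 8 features per state, 3 candidate actions.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
w_value = rng.normal(size=(8, 1))
w_adv = rng.normal(size=(8, 3))

q = dueling_q_values(features, w_value, w_adv)
greedy_actions = q.argmax(axis=1)     # greedy action index for each state
```

In a full agent the two streams would sit on top of a shared deep network and be trained with mini-batch gradient descent; here they are single linear maps so that the aggregation step stays visible.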