Abstract:
"Dispersed, Disordered and Polluted" enterprises have long been difficult to address in environmental supervision. To identify such enterprises from power data, this paper proposes a recognition strategy based on the BSMOTE and CatBoost algorithms. First, multi-dimensional features are constructed through feature engineering from the customer profile and electricity consumption data held in the data center, and the key features are selected. Second, an adaptive sampling method based on the BSMOTE algorithm is applied to balance the samples. Finally, the CatBoost algorithm is used to identify suspected "Dispersed, Disordered and Polluted" enterprises in Fujian Province, and the results of this case study verify the feasibility and effectiveness of the proposed method.
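
To make the sample-balancing step concrete, the following is a minimal sketch of borderline oversampling, assuming that "BSMOTE" refers to Borderline-SMOTE. It is not the paper's implementation: the function name `borderline_smote`, the parameters `m`, `k` and `n_new`, and the toy one-dimensional data are all hypothetical and chosen only for illustration. The sketch creates synthetic minority samples only around minority points whose neighbourhood is dominated, but not fully occupied, by majority samples.

```python
import math
import random


def borderline_smote(X, y, minority=1, m=3, k=2, n_new=2, seed=0):
    """Illustrative Borderline-SMOTE-style oversampling (hypothetical helper).

    X: list of feature tuples, y: list of labels.
    m: neighbours (from all samples) used to decide whether a point is borderline.
    k: minority neighbours used as interpolation partners.
    n_new: synthetic samples generated per borderline point.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for i in minority_idx:
        # m nearest neighbours of sample i among all other samples
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:m]
        n_major = sum(1 for j in neighbours if y[j] != minority)
        # Borderline ("danger") point: at least half, but not all,
        # of its neighbours belong to the majority class.
        if not (m / 2 <= n_major < m):
            continue
        # k nearest neighbours of sample i within the minority class
        partners = sorted(
            (j for j in minority_idx if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        if not partners:
            continue
        for _ in range(n_new):
            j = rng.choice(partners)
            gap = rng.random()  # interpolation weight in [0, 1)
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(X[i], X[j]))
            )
    return X + synthetic, y + [minority] * len(synthetic), synthetic


# Toy data (made up): label 1 is the rare class, label 0 the common class.
X = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.8, 0.0), (1.0, 0.0),
     (1.1, 0.0), (1.3, 0.0), (2.0, 0.0), (2.2, 0.0), (2.4, 0.0),
     (2.6, 0.0), (2.8, 0.0), (3.0, 0.0)]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

X_res, y_res, new_points = borderline_smote(X, y)
print(len(X), "->", len(X_res), "samples;", len(new_points), "synthetic")
```

In this toy set only the two minority points next to the class boundary (at 0.8 and 1.0) are treated as borderline, so new samples are generated around them and not around the well-separated minority points. In practice, a library implementation such as `BorderlineSMOTE` from imbalanced-learn could take the place of this helper, and the balanced data would then be passed to a gradient-boosting classifier such as `CatBoostClassifier`; neither is used here so that the sketch stays dependency-free.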