Abstract:
In current power data usage, the increased openness, frequent flow, and complex interaction objects of data expose it to leakage risk throughout the data life cycle. To address this problem, this study proposes a generative adversarial network (GAN) model for the anonymization of power data. The model first parses the original JSON file and encodes it with type-specific feature encoders so that different types of variables are handled effectively. Second, the generative adversarial network is improved with mechanisms such as loss feedback and Wasserstein loss with gradient penalty (Was+GP) to generate anonymized data, and random noise is added during data generation to protect privacy. Compared with existing methods, the model can take original data of mixed data types and generate anonymized data that has high utility, is strongly similar to the original data, and is decoupled from it. Experiments show that the differences between the data synthesized by the proposed model and the original data, in terms of machine learning utility and statistical similarity, are significantly reduced. The synthesized data can therefore replace the original data for mining, analysis, and data sharing, while effectively protecting the privacy of the original data.
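
For reference, and assuming that Was+GP denotes the standard Wasserstein loss with gradient penalty (WGAN-GP), the critic and generator objectives can be sketched as follows. The symbols D (critic), G (generator), P_r (real data distribution), P_g (generated data distribution), and λ (penalty weight) are notation introduced here for illustration only; the loss feedback and noise mechanisms of the proposed model are not shown.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
L_G &= -\,\mathbb{E}_{z \sim p(z)}\big[D(G(z))\big], \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x},
       \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

Under this standard formulation, the penalty term pushes the gradient norm of the critic toward 1 at points interpolated between real and generated samples, which acts as a soft Lipschitz constraint and tends to stabilize training.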