YANG Yixin, SHEN Wen, GUO Qian, CHEN Ye, SHAN Qiuhong, SONG Yubo. Generative Adversarial Network for Power Data Anonymization Protection[J]. Electric Power Information and Communication Technology, 2025, 23(2): 51-58. DOI: 10.16543/j.2095-641x.electric.power.ict.2025.02.07

Generative Adversarial Network for Power Data Anonymization Protection

  • Abstract: In current power data usage environments, data is increasingly open, circulates frequently, and is exchanged among complex sets of parties, so the risk of data leakage is present throughout the data life cycle. To address this problem, this paper proposes a generative adversarial network for power data anonymization protection. The method first parses the original JSON file and encodes it with different feature encoders so that different types of variables are handled effectively. It then improves the generative adversarial network with mechanisms such as loss feedback and Was+GP to generate anonymized data, and adds random noise to protect privacy during data generation. Compared with existing methods, the proposed method can generate, from original data of mixed data types, anonymized data that has high utility and strong similarity to the original data while remaining decoupled from it. Experiments show that the gap between the synthesized data and the original data in machine learning utility and statistical similarity is significantly reduced, so the synthesized data can replace the original data for mining, analysis, and data sharing, effectively protecting the privacy of the original data.
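
The first step described in the abstract (parsing JSON and applying a different encoder to each variable type) might look roughly like the sketch below. It is an illustration only, not the authors' implementation: the field names `meter.region` and `load_kw`, the sample records, and the helper names `flatten`, `fit_encoders`, and `encode` are hypothetical. Min-max scaling and one-hot encoding are stand-ins for whatever feature encoders the paper actually uses.

```python
import json

def flatten(record, prefix=""):
    """Flatten nested JSON objects into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

def fit_encoders(rows):
    """Pick an encoder per column: min-max for numeric, one-hot otherwise.

    Assumption: these two encoders are illustrative stand-ins only.
    """
    encoders = {}
    for col in sorted(rows[0]):
        values = [row[col] for row in rows]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if numeric:
            encoders[col] = ("numeric", min(values), max(values))
        else:
            encoders[col] = ("categorical", sorted(set(map(str, values))))
    return encoders

def encode(row, encoders):
    """Encode one flattened row as a list of floats."""
    vector = []
    for col, spec in encoders.items():
        if spec[0] == "numeric":
            _, lo, hi = spec
            span = (hi - lo) or 1.0  # avoid division by zero on constant columns
            vector.append(2.0 * (row[col] - lo) / span - 1.0)  # scale to [-1, 1]
        else:
            categories = spec[1]
            vector.extend(1.0 if str(row[col]) == c else 0.0 for c in categories)
    return vector

# Hypothetical sample input; real power data records will differ.
raw = """[
  {"meter": {"region": "east"}, "load_kw": 10.0},
  {"meter": {"region": "west"}, "load_kw": 20.0},
  {"meter": {"region": "east"}, "load_kw": 30.0}
]"""

rows = [flatten(record) for record in json.loads(raw)]
encoders = fit_encoders(rows)
vectors = [encode(row, encoders) for row in rows]
```

Each entry of `vectors` is a fixed-length numeric row that a generator and discriminator could consume. Scaling numeric columns to [-1, 1] is a common choice when the generator ends in a `tanh` activation, which is itself an assumption here.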

     
