Extracting Typical Load Patterns of Commercial Users Based on SNN-Density Peak Clustering Algorithm

王俊; 肖辉; 王家奇; 龙飞宇

doi:10.13357/j.dlxb.2023.007

Extracting Typical Load Patterns of Commercial Users Based on SNN-Density Peak Clustering Algorithm

Abstract

Abstract: Extracting and classifying the typical loads of commercial users accurately, quickly and efficiently is essential for power grid companies that want to understand those users' electricity consumption behavior and demand patterns. In a big-data setting, traditional clustering algorithms applied to commercial user load curves, which are high-dimensional and aggregated and whose clusters differ widely, suffer from a hard-to-choose cutoff distance, unclear clustering results and inefficient load pattern extraction. This paper therefore proposes an algorithm that improves both the local density measure and the selection of cluster centers. First, the data are preprocessed to remove load curves with low completeness. Next, PCA is used to reduce the dimension of the remaining curves, and the improved SNN-DPC algorithm builds the shared-neighbor set of each sample point and computes a distance matrix, dist, which replaces the distance matrix of the original algorithm as input. Then, with the SNN similarity, the sample local density ρ and the distance δ to the maximum-density point redefined, the cluster centers are identified from an inflection point and the sampled curves are clustered.
In short, the improved algorithm defines sample similarity through the neighbors that sample points share, which lets it analyze multi-dimensional heterogeneous load data accurately. It determines the true cluster centers from the inflection point, which removes the subjective choice of centers and thereby greatly improves load clustering. The case study shows that: 1) on a measured load data set of commercial users, the proposed algorithm selects cluster centers more accurately and runs efficiently; 2) compared with traditional algorithms, the load pattern recognition model built on the improved algorithm better helps power grid companies analyze users' electricity consumption characteristics and identifies the typical load patterns of different commercial users more accurately. Overall, the proposed strategy offers a performance advantage in real commercial-user scenarios.
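
The pipeline summarized in the abstract (shared-neighbor similarity, local density ρ, distance δ, inflection-point selection of centers) can be illustrated with a small sketch. The abstract does not give the paper's redefined formulas, so everything below is an assumption made for illustration: the similarity follows the commonly described SNN-DPC form (squared size of the shared-neighbor set divided by the summed distances to it), δ is the plain density-peak distance to the nearest point of higher density, and the "inflection point" is approximated by the largest drop in the sorted product γ = ρ·δ. The function name `snn_dpc_sketch` and the parameter `k` are hypothetical, and preprocessing and PCA are omitted (inputs are assumed to be already cleaned and reduced).

```python
import math


def snn_dpc_sketch(points, k):
    """Toy SNN-based density-peak clustering; not the paper's exact method."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k-nearest-neighbour set of each point (normally contains the point itself).
    knn = [set(sorted(range(n), key=lambda j: dist[i][j])[:k]) for i in range(n)]

    # Assumed SNN similarity: non-zero only if i and j are in each other's
    # neighbourhood, then |shared|^2 / sum of distances from i and j to shared points.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = knn[i] & knn[j]
            if i in shared and j in shared:
                denom = sum(dist[i][p] + dist[j][p] for p in shared)
                if denom > 0:
                    sim[i][j] = sim[j][i] = len(shared) ** 2 / denom

    # Local density rho: sum of the k largest similarities of each point.
    rho = [sum(sorted(row, reverse=True)[:k]) for row in sim]

    # delta: distance to the nearest point ranked higher in density
    # (plain density-peak definition, used here as a stand-in).
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest point: use its largest distance
        else:
            parent[i] = min(order[:rank], key=lambda j: dist[i][j])
            delta[i] = dist[i][parent[i]]

    # Assumed inflection rule: cut the sorted gamma = rho * delta at its largest drop.
    gamma = [rho[i] * delta[i] for i in range(n)]
    ranked = sorted(range(n), key=lambda i: -gamma[i])
    gaps = [gamma[ranked[r]] - gamma[ranked[r + 1]] for r in range(n - 1)]
    cut = gaps.index(max(gaps)) + 1
    centres = set(ranked[:cut]) | {order[0]}  # densest point is always a centre

    # Each remaining point inherits the label of its higher-density parent.
    labels = [-1] * n
    next_label = 0
    for i in order:
        if i in centres:
            labels[i] = next_label
            next_label += 1
        else:
            labels[i] = labels[parent[i]]
    return labels, sorted(centres)


# Two well-separated toy groups standing in for PCA-reduced load curves.
curves = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
labels, centres = snn_dpc_sketch(curves, k=3)
print(labels, centres)
```

The sketch uses only the standard library to stay self-contained; on real load curves one would likely vectorize the distance and similarity computations and apply the paper's own definitions of ρ, δ and the inflection criterion.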
