Research on Data Cleaning and Fusion in Distribution Power Grid Based on Time Series Technology

朱有产; 梁玮轩; 王英姿

doi:10.13335/j.1000-3673.pst.2020.0886


Abstract

With the rapid development of the distribution Internet of Things (IoT), massive heterogeneous data are continuously generated at the production, transmission, and consumption ends. These data are updated quickly, are of poor quality, have low value density, and exhibit strong time-series characteristics. Extracting high-quality, valuable data from this volume while reducing data redundancy requires effective data cleaning and fusion methods. To this end, a data cleaning and fusion method based on time-series similarity measurement is proposed. The method uses symbolic aggregate approximation (SAX), the Euclidean distance, and similar sequences weighted by adjusted similarity to clean the data, and a multivariate heterogeneous data fusion algorithm to fuse the data. Experiments on 1440-point load data show that the method can detect abnormal data in the distribution network, fill in missing data, reduce data redundancy, and fuse heterogeneous data. The processed data have high accuracy and the computational complexity is low; overall, data quality is improved, providing reliable basic data for distribution IoT applications.
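The cleaning step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the segment count, the four-letter alphabet, the inverse-distance weight `1 / (1 + d)`, and all function names (`sax`, `euclidean`, `weighted_fill`) are assumptions made only for illustration. The sketch converts a series to a SAX word, measures Euclidean distance between sequences, and fills one missing point from similar sequences weighted by their similarity.

```python
import math
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize a series; a constant series maps to all zeros."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: mean of equal-width segments."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def sax(series, segments, alphabet="abcd"):
    """Map a series to a SAX word using equiprobable Gaussian breakpoints."""
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = []
    for value in paa(znorm(series), segments):
        # The symbol index is the number of breakpoints below the value.
        index = sum(value > b for b in breakpoints)
        word.append(alphabet[index])
    return "".join(word)


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_fill(target, candidates, missing_index):
    """Estimate target[missing_index] from similar sequences.

    Assumed weighting (not from the paper): each candidate gets weight
    1 / (1 + d), where d is its Euclidean distance to the target over
    the points that are known in the target.
    """
    known = [i for i in range(len(target)) if i != missing_index]
    weights = []
    for cand in candidates:
        d = euclidean([target[i] for i in known], [cand[i] for i in known])
        weights.append(1.0 / (1.0 + d))
    total = sum(weights)
    return sum(w * c[missing_index] for w, c in zip(weights, candidates)) / total


if __name__ == "__main__":
    load = [1, 1, 2, 2, 3, 3, 4, 4]
    print(sax(load, segments=4))
    target = [1.0, 2.0, None, 4.0]  # None marks the missing point
    similar = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 9.0, 7.0]]
    print(weighted_fill(target, similar, missing_index=2))
```

In a workflow of this kind, SAX words would typically serve as a cheap first filter for candidate sequences, with the Euclidean distance computed only on the shortlisted ones; that ordering is likewise an assumption here.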
