Optimal Query Method of Big Data for Status Monitoring of Substation Equipment
Abstract: Status monitoring data of substation equipment are large in volume and low in value density, and traditional data processing methods cannot adequately support the fast queries required by applications such as status monitoring, evaluation, and diagnosis. By analyzing the characteristics of status monitoring data and the distributed column-oriented storage method, this paper presents a big data processing framework for substation equipment status monitoring. Three composite row key structures are designed by combining data attributes such as acquisition time, monitoring device code, and the code of the monitored equipment, which improves the flexibility of row-key queries. To address the low efficiency of full table scans when the row key is unknown, a coprocessor-based method for building secondary indexes is proposed, enabling fast queries under non-row-key constraints. Experimental results show that the coprocessor-based secondary index clearly improves query efficiency compared with both no index and the IHBase secondary index, while having little effect on the write speed of status monitoring data, and that it can better meet the need for fast, flexible queries over substation equipment status monitoring data in a big data environment.
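The two ideas summarized in the abstract, composite row keys and a secondary index maintained on the write path, can be illustrated with a small in-memory sketch. This is not the paper's implementation and does not use the HBase API: the `Table` class, the `|` separator, the field names (`time`, `sensor`, `equipment`, `status`), and the field ordering are all assumptions made for illustration. The sorted key list stands in for a store ordered by row key, and the index update inside `put` stands in for what a coprocessor hook might do when a row is written.

```python
from bisect import bisect_left, insort

SEP = "|"  # assumed separator; fields are assumed to be fixed-width strings


def row_key(order, record):
    """Concatenate the chosen fields, in the chosen order, into one row key."""
    return SEP.join(record[name] for name in order)


class Table:
    """Toy sorted key-value table with one secondary index (illustrative only)."""

    def __init__(self, key_order, indexed_column):
        self.key_order = key_order          # e.g. ("equipment", "time", "sensor")
        self.indexed_column = indexed_column
        self.keys = []                      # row keys, kept sorted
        self.rows = {}                      # row key -> record
        self.index = {}                     # column value -> set of row keys

    def put(self, record):
        key = row_key(self.key_order, record)
        old = self.rows.get(key)
        if old is None:
            insort(self.keys, key)
        else:
            # Drop the stale index entry when a row is overwritten.
            self.index[old[self.indexed_column]].discard(key)
        self.rows[key] = record
        # Index maintenance happens as part of the write, so reads on the
        # indexed column never need to scan the whole table.
        self.index.setdefault(record[self.indexed_column], set()).add(key)
        return key

    def prefix_scan(self, prefix):
        """Row-key query: keys sharing a prefix are contiguous in sorted order."""
        result = []
        for key in self.keys[bisect_left(self.keys, prefix):]:
            if not key.startswith(prefix):
                break
            result.append(self.rows[key])
        return result

    def query_by_index(self, value):
        """Non-row-key query answered through the secondary index."""
        return [self.rows[k] for k in sorted(self.index.get(value, ()))]

    def full_scan(self, column, value):
        """Baseline without an index: inspect every row."""
        return [self.rows[k] for k in self.keys if self.rows[k][column] == value]


if __name__ == "__main__":
    table = Table(("equipment", "time", "sensor"), "status")
    table.put({"equipment": "T01", "time": "20150101000000",
               "sensor": "S001", "status": "normal"})
    table.put({"equipment": "T01", "time": "20150101000500",
               "sensor": "S002", "status": "alarm"})
    print(table.prefix_scan("T01" + SEP))   # all readings for equipment T01
    print(table.query_by_index("alarm"))    # lookup without knowing the row key
```

Changing `key_order` changes which queries become cheap prefix scans: leading with the equipment code favors per-equipment history, while leading with the time field would favor time-window queries. The index makes each write do slightly more work, which is the trade-off the abstract reports as small.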