A Smart Grid-Oriented Data Placement Strategy for Data-Intensive Cloud Environments
Abstract: In the smart grid, data-intensive applications often involve data transfers across data centers and data migration within a data center, which poses new challenges for data distribution. A single application may need several datasets located in different data centers, so placement must contend with both the high cost of moving data between centers and the data dependencies within a center. To make full use of computing and storage resources, and to meet the practical need for reliable storage and efficient processing of large-scale smart grid data, a cloud-based two-stage data placement strategy is proposed that maps the datasets to a set of points in a data space. In the first stage, the conventional K-means algorithm produces an initial classification of the point set. In the second stage, the initial classification is adjusted using a data migration cost function that accounts for each dataset's membership in the initial clusters and for data dependencies, which yields the final placement of datasets onto data centers. Simulation results show that the algorithm effectively improves data access efficiency and reduces the cost of data movement while keeping the global load balanced.
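The two-stage procedure summarized above can be illustrated with a minimal sketch. The abstract does not give the paper's cost function or its load-balancing rule, so everything specific here is an assumption: the names `kmeans`, `rebalance` and `place`, the per-center `capacity` limit, and the migration cost, taken as dataset size multiplied by Euclidean distance to a center's centroid. It should be read as one possible instance of "K-means, then cost-guided adjustment", not as the authors' algorithm.

```python
import random
from math import dist


def kmeans(points, k, iters=50, seed=0):
    """Stage 1: plain K-means over dataset points (tuples of floats)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append(tuple(sum(x) / len(members)
                                     for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the centroids that are returned.
    labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
              for p in points]
    return labels, centroids


def rebalance(points, sizes, labels, centroids, capacity):
    """Stage 2 (assumed form): move datasets out of overloaded centers.

    Hypothetical migration cost: size * distance to the target centroid.
    Each step applies the single move with the smallest cost increase
    whose destination stays within `capacity`.
    """
    labels = list(labels)
    k = len(centroids)
    load = [0.0] * k
    for l, s in zip(labels, sizes):
        load[l] += s

    def cost(i, c):
        return sizes[i] * dist(points[i], centroids[c])

    while True:
        best = None
        for i, src in enumerate(labels):
            if load[src] <= capacity:
                continue  # only overloaded centers give up datasets
            for dst in range(k):
                if dst == src or load[dst] + sizes[i] > capacity:
                    continue
                extra = cost(i, dst) - cost(i, src)
                if best is None or extra < best[0]:
                    best = (extra, i, dst)
        if best is None:
            break  # nothing overloaded, or no feasible move remains
        _, i, dst = best
        load[labels[i]] -= sizes[i]
        load[dst] += sizes[i]
        labels[i] = dst
    return labels, load


def place(points, sizes, k, capacity):
    """Run both stages; returns (center index per dataset, load per center)."""
    labels, centroids = kmeans(points, k)
    return rebalance(points, sizes, labels, centroids, capacity)


# Illustrative data only: six datasets near the origin, two far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (0.5, 0.5), (0.2, 0.8), (10.0, 10.0), (10.0, 11.0)]
sizes = [1.0] * len(points)
labels, load = place(points, sizes, k=2, capacity=4.0)
```

In this sketch the first stage groups datasets purely by proximity in the data space, and the second stage trades a small increase in the assumed migration cost for a bounded load on each center, which mirrors the abstract's stated goal of efficient access with global load balance.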