Abstract: Most data compression algorithms do not support queries directly on the compressed result. Linearization compression algorithms for big data do allow neighbor (adjacency) queries to be answered directly on the compressed data, but their compression ratio is relatively low. To address this problem, this paper studies the principle behind linearization compression and analyzes the compression efficiency of the MPk linearization algorithm on different social network samples. The analysis finds redundant information in the linearized compression result. An improved algorithm is designed that removes the redundant parts of the original data structure, further improving the compression ratio. Experimental results show that the improved algorithm has the same time complexity as the original algorithm while improving the compression ratio by 23% on average.
Key words:
linearization compression algorithm,
big data,
social network,
heuristic algorithm,
Eulerian data structure
CLC number:
GAO Shengwei (高圣巍), PENG Chao (彭超). An Improved Compression Algorithm Supporting Neighbor Query[J]. Computer Engineering (计算机工程), 2015, 41(1): 61-64.