Computer Engineering ›› 2011, Vol. 37 ›› Issue (21): 58-60,67. doi: 10.3969/j.issn.1000-3428.2011.21.020

• Software Technology and Database •

Improved Order-preserving String Compression Method Based on Dictionary

LI Hai-yan, XIA Xiao-ling

  1. (School of Computer Science and Technology, Donghua University, Shanghai 201620, China)
  • Received: 2011-04-18 Online: 2011-11-05 Published: 2011-11-05
  • About the authors: LI Hai-yan (b. 1985), female, master's student, main research interests: data compression, query processing and optimization; XIA Xiao-ling, associate professor

Abstract: Traditional dictionary-based order-preserving string compression methods take a long time to compress and decompress data. To address this, this paper improves the encoding index CS-Prefix-Tree to reduce the high memory consumption and time cost of creating the encoding index. According to the probability of string occurrence, it designs a new decoding index to reduce lookup time and improve compression performance. Experimental results show that, compared with the traditional method, the improved method reduces the creation time by 1/3, considerably lowers memory consumption, and reduces the lookup time by nearly 30%.

Key words: string compression, shared leaves, dictionary, encoding index, decoding index

CLC Number:
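
To make the terms in the abstract concrete, the following is a minimal sketch of order-preserving dictionary encoding in general. It is not the paper's CS-Prefix-Tree or its probability-based decoding index: here a sorted list with binary search stands in for the encoding index, and plain list indexing stands in for the decoding index. The class name `OrderPreservingDictionary` and the sample column are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


class OrderPreservingDictionary:
    """Illustrative stand-in, not the CS-Prefix-Tree from the paper."""

    def __init__(self, values):
        # Sorted distinct strings; the position of each string is its code,
        # so a smaller string always receives a smaller code.
        self._strings = sorted(set(values))

    def encode(self, s):
        # "Encoding index": binary search over the sorted strings.
        i = bisect_left(self._strings, s)
        if i == len(self._strings) or self._strings[i] != s:
            raise KeyError(s)
        return i

    def decode(self, code):
        # "Decoding index": direct lookup by integer code.
        return self._strings[code]


column = ["pear", "apple", "fig", "apple", "kiwi"]
dictionary = OrderPreservingDictionary(column)
codes = [dictionary.encode(s) for s in column]
decoded = [dictionary.decode(c) for c in codes]
```

Because the codes preserve string order, a range predicate such as `value < "kiwi"` can be evaluated on the integer codes alone, without decompressing the column.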