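Abstract (translated from the Chinese original): To suit the data storage characteristics of real-time databases in financial business, a Structured Mixed Compression (SMC) algorithm is proposed. The SMC algorithm exploits three properties of financial data: the data are plain text, they are dispersed, and there is little repetition within a data item. Taking Huffman encoding as its basis, the algorithm mixes single characters and phrases according to their frequency and introduces an array structure into the Huffman tree to compress the text data. Test results show that the average data compression ratio of the SMC algorithm is about 13% higher than that of the original Huffman algorithm.
Key words:
data compression,
compression algorithm,
Huffman encoding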
Abstract: This paper presents a new data compression algorithm, the Structured Mixed Compression (SMC) algorithm, adapted to the characteristics of real-time databases used in financial business. Financial business data are plain text, dispersed, and contain few repeated fields within a single data item. Building on Huffman encoding, the SMC algorithm therefore mixes single characters and phrases according to their frequency and introduces an array structure into the Huffman tree to compress the business data. Test results show that the average compression ratio of the SMC algorithm is about 15% higher than that of the original Huffman encoding.
Key words:
data compression,
compression algorithm,
Huffman encoding
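The core idea in the abstract, Huffman coding over an alphabet that mixes single characters with frequent phrases, can be illustrated with a minimal Python sketch. This is not the authors' SMC implementation: the abstract does not specify how phrases are selected or how the array structure is placed in the Huffman tree, so the helper names (`frequent_phrases`, `tokenize`, `encode`, `decode`), the phrase pattern, and the sample records are all assumptions made for illustration, and the array structure is not modeled.

```python
import heapq
import re
from collections import Counter
from itertools import count


def huffman_code(freqs):
    """Return {symbol: bitstring} for a frequency table (standard Huffman)."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def frequent_phrases(text, min_count=2):
    """Assumed phrase selection: uppercase field names/values seen repeatedly."""
    counts = Counter(re.findall(r"[A-Z]+=?", text))
    return [p for p, n in counts.items() if n >= min_count and len(p) >= 2]


def tokenize(text, phrases):
    """Greedy longest-match split into phrases and single characters."""
    ordered = sorted((p for p in phrases if p), key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for p in ordered:
            if text.startswith(p, i):
                tokens.append(p)
                i += len(p)
                break
        else:  # no phrase matched here: fall back to one character
            tokens.append(text[i])
            i += 1
    return tokens


def encode(tokens):
    """Huffman-encode a token sequence; returns (bitstring, code table)."""
    codes = huffman_code(Counter(tokens))
    return "".join(codes[t] for t in tokens), codes


def decode(bits, codes):
    """Invert encode(); relies on Huffman codes being prefix-free."""
    inverse = {b: s for s, b in codes.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical plain-text financial records, invented for this sketch.
    text = (
        "ACCOUNT=1001;TYPE=DEPOSIT;AMOUNT=250.00\n"
        "ACCOUNT=1002;TYPE=WITHDRAW;AMOUNT=75.50\n"
        "ACCOUNT=1003;TYPE=DEPOSIT;AMOUNT=12.00\n"
    )
    char_bits, _ = encode(list(text))
    mixed_bits, _ = encode(tokenize(text, frequent_phrases(text)))
    print("characters only:", len(char_bits), "bits")
    print("characters + phrases:", len(mixed_bits), "bits")
```

The bit counts printed by the demo exclude the cost of storing the code table, so they are only a rough indication of how a mixed alphabet changes the encoded size; they are not comparable to the compression ratios reported in the abstract.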
CLC number:
贾永洁;王耀强;郑 骏. 金融业务数据库的数据压缩方法[J]. 计算机工程, 2008, 34(11): 281-282.
JIA Yong-jie; WANG Yao-qiang; ZHENG Jun. Data Compression Method of Financial Business Database[J]. Computer Engineering, 2008, 34(11): 281-282.