Abstract: Bitmap indexes compressed with a Bloom filter return only approximate query results. To address this problem, a precise bitmap index algorithm named FPT-Index is proposed. FPT-Index uses a Bloom filter to compress the basic bitmap index and introduces a false positive table to screen the query results, so that queries are answered exactly. Theoretical analysis shows that, for a given keyword frequency, the minimum compression ratio and the corresponding number of hash functions can be computed. Experimental results show that FPT-Index outperforms the WAH method in both compression ratio and query efficiency.
Key words:
bitmap index,
Bloom filter (BF),
false positive probability,
false positive table,
compression ratio,
search efficiency
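The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the index layout, so the `row:value` key encoding, the SHA-256 double hashing, the `known_values` set, and the names `BloomFilter` and `ExactBloomIndex` are all assumptions made for illustration. The sketch stores every (row, value) pair of a column in a Bloom filter, then records in a false positive table every pair the filter wrongly accepts, and uses that table to screen query results.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; hash positions come from double hashing (an assumption)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class ExactBloomIndex:
    """Hypothetical exact index: Bloom filter plus a false positive table."""

    def __init__(self, column, num_bits, num_hashes):
        self.num_rows = len(column)
        self.known_values = set(column)
        self.bloom = BloomFilter(num_bits, num_hashes)
        for row, value in enumerate(column):
            self.bloom.add(self._key(row, value))
        # False positive table: (row, value) pairs the filter accepts
        # although the row does not actually hold that value.
        self.false_positives = set()
        for value in self.known_values:
            for row, actual in enumerate(column):
                if actual != value and self._key(row, value) in self.bloom:
                    self.false_positives.add((row, value))

    @staticmethod
    def _key(row, value):
        return f"{row}:{value}"

    def query(self, value):
        # Values never indexed have no entries in the table, so reject them early.
        if value not in self.known_values:
            return []
        return [row for row in range(self.num_rows)
                if self._key(row, value) in self.bloom
                and (row, value) not in self.false_positives]


column = ["a", "b", "a", "c", "b", "a"]
index = ExactBloomIndex(column, num_bits=32, num_hashes=2)
print(index.query("a"))  # [0, 2, 5]
```

A smaller `num_bits` shrinks the filter but enlarges the false positive table, which is the trade-off behind the minimum compression ratio mentioned in the abstract. The sketch does not attempt to reproduce the paper's analysis of that minimum.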
XIAO Lin, LIANG Jun, CHOU Wen-Liang. Precise Bitmap Index Based on Bloom Filter (基于Bloom过滤器的精确位图索引)[J]. Computer Engineering (计算机工程), 2011, 37(13): 272-274, 278.