作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2011, Vol. 37 ›› Issue (13): 272-274,278. doi: 10.3969/j.issn.1000-3428.2011.13.090

• 开发研究与设计技术 • 上一篇    下一篇

基于Bloom过滤器的精确位图索引

肖 琳1,梁 军2,钮文良1   

  1. (1. 北京联合大学应用科技学院,北京 102200;2. 北京联合大学电子信息技术实验实训基地,北京 100101)
  • 收稿日期:2011-04-18 出版日期:2011-07-05 发布日期:2011-07-05
  • 作者简介:肖 琳(1980-),女,讲师、博士,主研方向:信息检索,网络通信;梁军,副教授;钮文良,教授

Precise Bitmap Index Based on Bloom Filter

XIAO Lin  1, LIANG Jun  2, NIU Wen-liang   1   

  1. (1. College of Applied Science and Technology, Beijing Union University, Beijing 102200, China; 2. Training Base of Electronic Information Technology, Beijing Union University, Beijing 100101, China)
  • Received:2011-04-18 Online:2011-07-05 Published:2011-07-05

摘要: 针对基于Bloom过滤器的位图索引方法查询结果不精确的问题,提出一种精确位图索引算法——FPT-Index。该算法采用Bloom过滤器对基本位图索引进行压缩,同时引入假阳表,对查询结果进行筛选,从而达到精确查询的目的。通过理论分析得出,在给定关键词出现频率的前提条件下,可计算出最小压缩率以及所需哈希函数的个数。实验结果表明,FPT-Index相较于WAH方法在压缩率和查询效率两方面都有较好的表现。

关键词: 位图索引, Bloom过滤器, 假阳率, 假阳表, 压缩率, 查询效率

Abstract: A precise bitmap index named FPT-Index is proposed to solve the problem that the query results are not precise in approximate bitmap index base on bloom filter. FPT-Index uses bloom filter to compress the basic bitmap index and introduces false positive table to screen the query results. The query results from FPT-Index are precise. Through theoretical analysis, the minimum of compression ratio and the corresponding number of hash functions can be worked out when the keyword frequency is confirmed. Experimental results show that FPT-Index does better in compression ratio and search performance than WAH.

Key words: bitmap index, Bloom Filter(BF), false positive probability, false positive table, compression ratio, search efficiency

中图分类号: