摘要: 在研究数据流过程中,基于现有的概要数据结构Bloom Filter,给出改进的KBloom Filter结构,从理论上对假阳性误判进行分析,得出两者具有相同的在误判率f0下表示集合规模的上限n0,因此,KBloom Filter的误判率在可控范围内。提出基于KBloom Filter的流计数算法,与基于Bloom Filter的流计数算法相比,在相同的空间复杂度O(m)和插入操作时间复杂度O(k)情况下,该算法降低了统计结果的误差。
关键词:
数据流,
布鲁姆过滤器,
概要数据结构
Abstract: During the process of studying data stream, this paper proposes the KBloom Filter structure which is based on the existing synopsis data structure, analyses the false positive misjudgment theoretically, concludes that both can present the upper limits of set scale n0under the same misjudgment rate f0, so the misjudgment rate of KBloom Filter is within the controllable range. The flow counting algorithm based on the KBloom Filter structure is gived, under the same space complexity O(m) and insert time complexity O(k), compared to the flow counting algorithm based on the Bloom Filter structure, the error of the counting result is lower.
Key words:
data stream,
Bloom Filter,
synopsis data structure
中图分类号:
廖豪, 梁峰, 谭建龙. 一种面向数据流模型的流计数算法[J]. 计算机工程, 2010, 36(23): 31-33,35.
LIAO Hao, LIANG Feng, TAN Jian-Long. Flow Counting Algorithm for Data Stream Model[J]. Computer Engineering, 2010, 36(23): 31-33,35.