Abstract: Mining Top-k Closed Sequential Patterns (Topk_CSP) in session streams involves a conflict between memory consumption and mining precision, and the severity of that conflict depends on the size of the correlation ratio ρ. Based on the False-Negative approach, this paper proposes the TStream algorithm, which applies two constraint strategies to restrict ρ. Building on these strategies, it designs a weighted harmonic count function that computes the support of each pattern progressively. Experimental results demonstrate the effectiveness of the algorithm.
Key words:
Top-k Closed Sequential Pattern (Topk_CSP),
Weighted Harmonic Average (WHA),
regulatory factor
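The abstract does not give the form of TStream's weighted harmonic count function, so the following is only a background sketch of the general weighted harmonic average named in the key words, sum(w_i) / sum(w_i / x_i). The function name, the per-window counts, and the weights are hypothetical illustration values, not taken from the paper.

```python
from typing import Sequence


def weighted_harmonic_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum(w_i) / sum(w_i / x_i) for strictly positive x_i and w_i."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(x <= 0 for x in values) or any(w <= 0 for w in weights):
        raise ValueError("values and weights must be strictly positive")
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


# Hypothetical support counts of one pattern in three stream windows,
# with larger (assumed) weights on the more recent windows.
window_counts = [40.0, 60.0, 120.0]
window_weights = [1.0, 2.0, 3.0]
estimate = weighted_harmonic_mean(window_counts, window_weights)
print(estimate)
```

A harmonic average is pulled toward the smaller inputs, so in this sketch a pattern with a low count in any window receives a conservative combined estimate. Whether TStream combines counts in this way is an assumption here.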
PENG Hui-li (彭慧丽); ZHANG Xiao-jian (张啸剑). Top-k Closed Sequential Pattern Mining in Session Streams[J]. Computer Engineering (计算机工程), 2009, 35(19): 86-87,9.