Computer Engineering ›› 2009, Vol. 35 ›› Issue (19): 86-87,9. doi: 10.3969/j.issn.1000-3428.2009.19.028

• Software Technology and Database •

Top-k Closed Sequential Pattern Mining in Session Streams

PENG Hui-li1, ZHANG Xiao-jian2   

  1. Department of Education, Henan Radio & Television University, Zhengzhou 450008
  2. Department of Computer Science, Henan University of Finance & Economics, Zhengzhou 450002
  • Online: 2009-10-05 Published: 2009-10-05

Abstract: When mining Top-k closed sequential patterns (Topk_CSP) in session streams, existing methods face a conflict between memory consumption and mining precision that depends on the size of the ratio ρ. This paper proposes the TStream algorithm, which is based on the False-Negative approach. TStream applies two constraint strategies to restrict ρ and, building on these strategies, uses a weighted harmonic count function to compute the support of each pattern progressively. Experimental results demonstrate the effectiveness of the algorithm.
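To make the mining target concrete, the following is a minimal brute-force sketch of what a Top-k closed sequential pattern is, computed over a small static list of sessions. It is not the TStream algorithm: it does no stream processing, uses no ρ, and has no False-Negative error control. The function names and the sample sessions are hypothetical, and the sketch assumes the usual definition in which a pattern is closed when no longer super-sequence has the same support.

```python
from collections import Counter
from itertools import combinations


def subsequences(session):
    """Return all distinct non-empty subsequences of a session.

    Order is preserved and gaps are allowed. The enumeration is
    exponential in the session length, so this is for tiny inputs only.
    """
    found = set()
    for length in range(1, len(session) + 1):
        for idx in combinations(range(len(session)), length):
            found.add(tuple(session[i] for i in idx))
    return found


def is_subsequence(short, long):
    """True if `short` occurs in `long` in order, gaps allowed."""
    it = iter(long)
    return all(item in it for item in short)


def top_k_closed(sessions, k):
    """Return the k closed sequential patterns with the highest support.

    Support is the number of sessions containing the pattern. A pattern
    is kept only if no longer super-sequence has the same support.
    """
    support = Counter()
    for session in sessions:
        support.update(subsequences(session))

    closed = [
        (pattern, count)
        for pattern, count in support.items()
        if not any(
            len(other) > len(pattern)
            and support[other] == count
            and is_subsequence(pattern, other)
            for other in support
        )
    ]
    # Highest support first; ties broken by pattern for a stable order.
    closed.sort(key=lambda pc: (-pc[1], pc[0]))
    return closed[:k]


# Hypothetical sessions: each tuple is an ordered list of visited pages.
sessions = [("a", "b", "c"), ("a", "b", "c"), ("a", "c"), ("b", "c")]
result = top_k_closed(sessions, 3)
```

In this sample, ("a",) has support 3 but is not closed, because its super-sequence ("a", "c") also has support 3. A stream algorithm cannot afford this exhaustive counting, which is where approximate, memory-bounded support counting comes in.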

Key words: Top-k Closed Sequential Pattern (Topk_CSP), Weighted Harmonic Average (WHA), regulatory factor
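For readers unfamiliar with the weighted harmonic average named in the key words, the sketch below computes the standard textbook form, the sum of the weights divided by the sum of each weight over its value. The abstract does not give the paper's actual count function, its weights, or how the regulatory factor enters, so the function name and the sample numbers here are assumptions for illustration only.

```python
def weighted_harmonic_mean(values, weights):
    """Standard weighted harmonic mean: sum(w) / sum(w / v).

    This is the generic definition, not the count function from the
    paper. Values and weights must be positive and of equal length.
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(v <= 0 for v in values) or any(w <= 0 for w in weights):
        raise ValueError("values and weights must be positive")
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


# Hypothetical support estimates for one pattern from two stream segments.
equal = weighted_harmonic_mean([40, 60], [1, 1])
skewed = weighted_harmonic_mean([40, 60], [2, 1])
```

With equal weights the result is about 48, below the arithmetic mean of 50. Doubling the weight on the smaller value pulls the result down to about 45, which shows how the harmonic form leans toward smaller values.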
