
Computer Engineering ›› 2010, Vol. 36 ›› Issue (22): 86-87. doi: 10.3969/j.issn.1000-3428.2010.22.030

• Software Technology and Database •

Mining Method of Data Stream Maximum Frequent Itemsets

ZHANG Yue-qin, CHEN Dong

  1. (College of Electronic and Information Engineering, Nanjing University of Technology, Nanjing 210009, China)
  • Online: 2010-11-20 Published: 2010-11-18
  • About the authors: ZHANG Yue-qin (born 1975), female, lecturer, master's degree; main research interests: artificial intelligence and data mining. CHEN Dong, lecturer, master's degree.
  • Supported by:
    Young Teachers Academic Fund of Nanjing University of Technology (39709013)

Abstract:

This paper proposes AFMI, a method for mining maximum frequent itemsets based on a transaction matrix. AFMI finds the maximum frequent itemsets across all transactions by iteratively condensing the transaction matrix, and it narrows the search by looking for them only among the subsets of the intersections of the condensed transaction vectors. Logical operations and pruning are used to improve mining efficiency. Building on AFMI, the algorithm AFMI+ is proposed for mining maximum frequent itemsets from a sliding window over a data stream; it lets users periodically mine the maximum frequent itemsets in the current window. Experimental results show that both AFMI and AFMI+ perform well.

Key words: data mining, data stream, sliding window, maximum frequent itemsets, matrix
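
To make the terms in the abstract concrete, the following is a minimal sketch of the general idea only, not the AFMI or AFMI+ algorithms themselves, whose details are not given on this page. It assumes a transaction matrix stored as one bit vector per item, support counting by bitwise AND, a brute-force search from large to small itemsets with subset pruning, and a fixed-size sliding window. All names (`build_columns`, `support`, `max_frequent_itemsets`, `SlidingWindowMiner`) are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_columns(transactions, items):
    """Return {item: bitmask}; bit t is set when transaction t contains item."""
    columns = {item: 0 for item in items}
    for t, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] |= 1 << t
    return columns


def support(itemset, columns, n_transactions):
    """Count transactions containing every item via bitwise AND of columns."""
    mask = (1 << n_transactions) - 1
    for item in itemset:
        mask &= columns[item]
    return bin(mask).count("1")


def max_frequent_itemsets(transactions, min_support):
    """Brute-force illustration (assumption: NOT the paper's AFMI search)."""
    transactions = list(transactions)
    n = len(transactions)
    items = sorted(set().union(*transactions))
    columns = build_columns(transactions, items)
    # Condense the matrix: drop items that are infrequent on their own.
    frequent = [i for i in items if support((i,), columns, n) >= min_support]
    found = []
    # Try the largest candidates first so that anything found is maximal.
    for size in range(len(frequent), 0, -1):
        for candidate in combinations(frequent, size):
            cset = frozenset(candidate)
            if any(cset <= f for f in found):
                continue  # pruning: subset of a known maximum frequent itemset
            if support(candidate, columns, n) >= min_support:
                found.append(cset)
    return found


class SlidingWindowMiner:
    """Keeps the last `window_size` transactions and mines them on demand."""

    def __init__(self, window_size, min_support):
        self.window = deque(maxlen=window_size)  # oldest transaction drops out
        self.min_support = min_support

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def mine(self):
        return max_frequent_itemsets(self.window, self.min_support)


if __name__ == "__main__":
    miner = SlidingWindowMiner(window_size=4, min_support=2)
    for t in ["abc", "ab", "ac", "bc"]:
        miner.add(t)
    print(sorted("".join(sorted(s)) for s in miner.mine()))  # ['ab', 'ac', 'bc']
    miner.add("abc")
    miner.add("abc")
    print(sorted("".join(sorted(s)) for s in miner.mine()))  # ['abc']
```

The brute-force search is exponential in the number of frequent items and is only meant to show what "maximum frequent itemset in the current window" means; the intersection-based search described in the abstract presumably exists to avoid exactly this cost.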

CLC Number: