
Computer Engineering (计算机工程) ›› 2009, Vol. 35 ›› Issue (1): 84-86. doi: 10.3969/j.issn.1000-3428.2009.01.028

• Software Technology and Database •

Frequent Itemsets Mining Algorithm Based on Matrix

ZHANG Zhong-ping, LI Yan, YANG Jing

  1. (College of Information Science & Engineering, Yanshan University, Qinhuangdao 066004)
  • Online: 2009-01-05 Published: 2009-01-05

Abstract: Mining frequent itemsets efficiently is a central problem in association rule mining. Drawing on set theory and matrix theory, this paper proposes a matrix-based algorithm for mining frequent itemsets. The algorithm scans the database only once, transforming every transaction into a row of a matrix and every item and itemset into a column. Operating on this matrix, it generates all frequent itemsets in one pass, and it does not need to rescan the database when the support threshold changes. Experimental results show that the algorithm mines more efficiently than the Apriori algorithm.

Key words: data mining, frequent itemsets, Apriori algorithm
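
The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the general idea it describes, not the authors' method. It assumes a Boolean transaction-by-item matrix stored column-wise as Python integers used as bit vectors, and it uses an ordinary level-wise join to combine columns. The names `build_matrix` and `frequent_itemsets` are hypothetical. The point it shows is that one scan builds the matrix, after which any support threshold can be evaluated with bitwise AND and no further database access.

```python
from itertools import combinations


def build_matrix(transactions):
    """Scan the transactions once and return the matrix column by column.

    columns[item] is an int used as a bit vector: bit r is set when
    transaction r (matrix row r) contains the item.
    """
    columns = {}
    for row, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def count_ones(column):
    """Support count of a column = number of rows set to 1."""
    return bin(column).count("1")


def frequent_itemsets(columns, min_count):
    """Return {itemset: support count} using only the in-memory matrix.

    Assumption: a plain level-wise search, where the column of a larger
    itemset is the bitwise AND of the columns of two smaller ones.
    """
    result = {}
    level = {
        frozenset([item]): col
        for item, col in columns.items()
        if count_ones(col) >= min_count
    }
    while level:
        for itemset, col in level.items():
            result[itemset] = count_ones(col)
        next_level = {}
        for (a, col_a), (b, col_b) in combinations(level.items(), 2):
            union = a | b
            # Join only pairs that differ in exactly one item.
            if len(union) != len(a) + 1 or union in next_level:
                continue
            col = col_a & col_b  # rows containing every item in the union
            if count_ones(col) >= min_count:
                next_level[union] = col
        level = next_level
    return result


# Made-up example data: five transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
columns = build_matrix(transactions)          # the single scan
at_least_3 = frequent_itemsets(columns, 3)    # first threshold
at_least_2 = frequent_itemsets(columns, 2)    # new threshold, no rescan
```

Storing each column as an integer keeps the sketch dependency-free. A real implementation would likely use a packed bit array or a numeric matrix library, and the paper's own matrix operations may differ from the join used here.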
