
Computer Engineering ›› 2008, Vol. 34 ›› Issue (17): 56-58,6. doi: 10.3969/j.issn.1000-3428.2008.17.021

• Software Technology and Database •

Optimized Algorithm for Mining Association Rule Based on Two Matrixes

HE Jian-zhong (何建忠), LV Zhen-jun (吕振俊)

  1. (College of Computer Engineering, Shanghai University of Science and Technology, Shanghai 200093)
  • Received:1900-01-01 Revised:1900-01-01 Online:2008-09-05 Published:2008-09-05

Abstract: To address the shortcomings of traditional data mining algorithms, this paper presents an optimized association rule mining algorithm based on two matrices. The algorithm scans the transaction database only once and converts it into two matrices that store only logical (Boolean) data, which keeps memory use low, while retaining the association information among items. Mining is then performed on the two matrices: frequent 1-itemsets and frequent 2-itemsets are obtained directly from matrix MA, the maximal frequent itemsets are obtained from matrix MB, and the remaining frequent k-itemsets are derived from the two matrices together with the frequent itemsets already found. The algorithm greatly reduces the number of candidate itemsets, and because mining relies on logical operations it is efficient. Experimental results show that the algorithm is feasible and efficient.

Key words: frequent itemset, association rule, matrix
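
The general idea in the abstract (one scan of the database to build Boolean data, then support counting with logical operations) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not define how matrices MA and MB are constructed, so the sketch assumes a plain layout with one bitmap per item, and the names `build_item_bitmaps`, `support`, `frequent_1_and_2_itemsets`, and the sample transactions are hypothetical.

```python
from itertools import combinations


def build_item_bitmaps(transactions):
    """Scan the transactions once; return {item: bitmap}.

    Bit t of an item's bitmap is 1 when transaction t contains that item,
    so each bitmap is one Boolean column of a transaction-by-item matrix.
    This layout is an assumption, not the paper's MA or MB.
    """
    bitmaps = {}
    for t, transaction in enumerate(transactions):
        for item in set(transaction):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << t)
    return bitmaps


def support(bitmap):
    """Count the transactions whose bit is set in the bitmap."""
    return bin(bitmap).count("1")


def frequent_1_and_2_itemsets(transactions, min_support):
    """Return (frequent 1-itemsets, frequent 2-itemsets) with their supports."""
    bitmaps = build_item_bitmaps(transactions)
    freq1 = {
        item: support(bitmap)
        for item, bitmap in bitmaps.items()
        if support(bitmap) >= min_support
    }
    freq2 = {}
    # Only pairs of frequent items can be frequent (Apriori property).
    for a, b in combinations(sorted(freq1), 2):
        count = support(bitmaps[a] & bitmaps[b])  # logical AND of two columns
        if count >= min_support:
            freq2[(a, b)] = count
    return freq1, freq2


# Hypothetical sample data for illustration only.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C", "D"},
]
freq1, freq2 = frequent_1_and_2_itemsets(transactions, min_support=3)
print(freq1)
print(freq2)
```

With this sample data and `min_support=3`, items A, B, and C are frequent, while D is pruned before any pair is tested, so no candidate pair containing D is ever generated.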
