Computer Engineering, 2010, Vol. 36, Issue (06): 89-90. doi: 10.3969/j.issn.1000-3428.2010.06.029

• Software Technology and Database •

Maximal Frequent Itemsets Mining Algorithm Based on Linked List Array

LIU Ying-dong1, LENG Ming-wei2, CHEN Xiao-yun3

  (1. School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070; 2. Department of Mathematics and Computer, Shangrao Normal University, Shangrao 334000; 3. School of Information Science and Engineering, Lanzhou University, Lanzhou 730000)

  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-03-20 Published: 2010-03-20

Abstract: Mining all frequent itemsets in dense datasets is very expensive. To address this problem, a new data structure, the linked list array, and a fast algorithm for generating Maximal Frequent Itemsets (MFI) based on it are proposed. The method uses the linked list array to build a transaction linked list for each item, and building the lists requires only one scan of the database. A depth-first search produces all candidate maximal frequent itemsets, and constraint conditions are used to reduce the search space. The algorithm is tested on standard datasets and compared with other algorithms; the experimental results show that it mines maximal frequent itemsets faster.

Key words: data mining, Maximal Frequent Itemsets (MFI), linked list array, solution space
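
The abstract names the ingredients of the method but does not give its details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes an array of per-item transaction lists (modelled here with a Python dictionary of lists as a stand-in for the linked list array), a single pass over the database to fill it, and a depth-first search that extends an itemset only while its support stays at or above the threshold. The function names `build_item_lists` and `maximal_frequent_itemsets`, the `min_support` parameter, and the final subset filter (a stand-in for the paper's unspecified constraint conditions) are all hypothetical.

```python
from collections import defaultdict


def build_item_lists(transactions):
    """Scan the database once and record, for each item, the ids of the
    transactions that contain it (a stand-in for the linked list array)."""
    item_lists = defaultdict(list)
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):  # ignore repeated items in a transaction
            item_lists[item].append(tid)
    return item_lists


def maximal_frequent_itemsets(transactions, min_support):
    """Return the maximal frequent itemsets as a list of frozensets.

    min_support is an absolute count of transactions (an assumption).
    """
    item_lists = build_item_lists(transactions)
    # Keep only frequent single items; sets make intersections cheap.
    tidsets = {item: set(tids) for item, tids in item_lists.items()
               if len(tids) >= min_support}
    items = sorted(tidsets)
    candidates = []

    def dfs(prefix, prefix_tids, start):
        extended = False
        for i in range(start, len(items)):
            item = items[i]
            tids = prefix_tids & tidsets[item] if prefix else tidsets[item]
            if len(tids) >= min_support:  # prune infrequent extensions
                extended = True
                dfs(prefix + [item], tids, i + 1)
        # A non-empty prefix with no frequent extension is a candidate.
        if not extended and prefix:
            candidates.append(frozenset(prefix))

    dfs([], set(), 0)
    # Discard candidates that are proper subsets of another candidate.
    return [c for c in candidates
            if not any(c < other for other in candidates)]


transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
result = maximal_frequent_itemsets(transactions, min_support=3)
```

On this toy database every pair of items occurs in three transactions and the full triple in only two, so a threshold of 3 yields the three pairs as maximal frequent itemsets, while a threshold of 2 yields the single itemset {a, b, c}. The subset filter at the end is quadratic in the number of candidates; a tuned implementation would more likely prune during the search.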
