Improved Incremental Mining Algorithm

doi:10.3969/j.issn.1000-3428.2010.24.015

Computer Engineering (计算机工程), 2010, Vol. 36, Issue (24): 42-44.

LI Chun-xi, ZHAO Lei (李春喜，赵雷)

(School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)

Online: 2010-12-20 Published: 2010-12-14
About the authors: LI Chun-xi (born 1984), male, master's student, main research interests: database technology and data mining; ZHAO Lei (corresponding author), associate professor
Funding:
National Natural Science Foundation of China (61073061)

Abstract

The Pre-FUFP algorithm uses the concept of pre-large items to update the frequent pattern tree efficiently. However, when a pre-large item becomes frequent, the algorithm must determine which transactions in the original database contain that item. To reduce the time this step takes, this paper introduces an index that maps each pre-large item to the identifiers of the original transactions containing it, so that only those transactions need to be processed when the FUFP-tree is modified. Frequent patterns are then mined with a compact FP-tree and a matrix-based technique in place of the original FP-growth. Experimental results show that the algorithm is substantially faster than Pre-FUFP.

Key words: frequent pattern, pre-large itemsets, incremental mining
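
The indexing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the usual pre-large definition (an item is pre-large when its support lies between a lower and an upper threshold) and represents the database as a dict from transaction identifier to item set. The names `build_prelarge_index`, `promoted_items`, `lower`, and `upper` are hypothetical, and the FUFP-tree update itself is omitted.

```python
from collections import Counter, defaultdict


def build_prelarge_index(db, lower, upper):
    """Scan the original database once.

    db: dict mapping transaction id -> set of items (assumed layout).
    lower, upper: support ratios; an item is treated as pre-large when
    lower * n <= count < upper * n (assumed definition).
    Returns (item counts, index from pre-large item to list of tids).
    """
    n = len(db)
    counts = Counter(item for items in db.values() for item in items)
    prelarge = {i for i, c in counts.items() if lower * n <= c < upper * n}
    index = defaultdict(list)
    for tid, items in db.items():
        for item in items & prelarge:
            index[item].append(tid)
    return counts, dict(index)


def promoted_items(counts, index, new_db, old_size, upper):
    """Find pre-large items that become frequent after new_db is added.

    Returns {item: tids in the original database}, so that only those
    transactions need to be revisited instead of rescanning everything.
    """
    total = old_size + len(new_db)
    new_counts = Counter(item for items in new_db.values() for item in items)
    return {
        item: tids
        for item, tids in index.items()
        if counts[item] + new_counts[item] >= upper * total
    }
```

With such an index, the cost of handling a promoted item depends on the number of original transactions that contain it, not on the size of the whole original database.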

CLC number: TP311.52

李春喜, 赵雷. 一种改进的增量挖掘算法[J]. 计算机工程, 2010, 36(24): 42-44.

LI Chun-xi, ZHAO Lei. Improved Incremental Mining Algorithm[J]. Computer Engineering, 2010, 36(24): 42-44.

http://www.ecice06.com/CN/Y2010/V36/I24/42
