
Computer Engineering ›› 2012, Vol. 38 ›› Issue (12): 55-58. doi: 10.3969/j.issn.1000-3428.2012.12.016

• Software Technology and Database •

Multi-core Parallel Closed Sequential Patterns Mining Based on BIDE

YU Dong-jin 1, ZHENG Su-hang 1, LI Wan-qing 1, WU Wei 2

  1. School of Computer, Hangzhou Dianzi University, Hangzhou 310018, China
  2. Zhejiang Key Laboratory of Network System and Information Security, Hangzhou 310006, China
  • Received: 2011-07-18 Online: 2012-06-20 Published: 2012-06-20
  • About the authors: YU Dong-jin (born 1969), male, professor-level senior engineer, Ph.D., CCF member; main research interests: data mining and database technology. ZHENG Su-hang, master's student. LI Wan-qing, lecturer. WU Wei, senior engineer.
  • Supported by:
    Major Science and Technology Program of Zhejiang Province (2008C11099-1); Fund of the Zhejiang Key Laboratory of Network System and Information Security

Abstract: Based on the classical BIDE algorithm, this paper proposes MT_BIDE, a multi-core parallel algorithm for mining closed sequential patterns. The algorithm prunes before checking whether a frequent sequence can be extended, and during extension it dynamically reassigns frequent sequences and their pseudo-projected datasets to balance the closed-pattern mining workload across threads. Experimental results show that the algorithm achieves high running efficiency and a high speedup ratio.

Key words: multi-core, closed sequence, BIDE algorithm, sequential pattern mining, pseudo-projected dataset
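
The abstract names two ingredients, pseudo-projected datasets and workload balancing across threads, without showing either. The sketch below illustrates the general idea only and is not the MT_BIDE algorithm: it grows prefixes over single-item sequences, represents each projected dataset as a list of (sequence id, offset) pointers instead of copied suffixes, and lets threads pull work from one shared queue so that no thread is tied to a fixed subtree. The names `mine_frequent`, `closed_only`, and `is_subsequence` are hypothetical, and closedness is checked by a naive post-filter in place of BIDE's bi-directional extension check and pruning. CPython threads show the scheduling pattern here, not a real multi-core speedup.

```python
import queue
import threading
from collections import defaultdict


def mine_frequent(db, min_sup, n_threads=4):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    Assumption: each sequence in db is an ordered collection of single items.
    """
    results = {}
    lock = threading.Lock()
    tasks = queue.Queue()
    # Root task: empty prefix, every sequence pseudo-projected at offset 0.
    tasks.put(((), [(sid, 0) for sid in range(len(db))]))

    def worker():
        while True:
            prefix, proj = tasks.get()
            try:
                # For each item, record the first position after the offset
                # in every sequence that contains it (one count per sequence).
                occ = defaultdict(list)
                for sid, start in proj:
                    seq = db[sid]
                    seen = set()
                    for pos in range(start, len(seq)):
                        item = seq[pos]
                        if item not in seen:
                            seen.add(item)
                            occ[item].append((sid, pos + 1))
                for item, new_proj in occ.items():
                    if len(new_proj) >= min_sup:
                        new_prefix = prefix + (item,)
                        with lock:
                            results[new_prefix] = len(new_proj)
                        # Hand the extended prefix and its pseudo-projection
                        # back to the shared queue; any idle thread may take it.
                        tasks.put((new_prefix, new_proj))
            finally:
                tasks.task_done()

    for _ in range(n_threads):
        threading.Thread(target=worker, daemon=True).start()
    tasks.join()  # returns once every queued prefix has been expanded
    return results


def is_subsequence(a, b):
    """True if a can be obtained from b by deleting items."""
    it = iter(b)
    return all(x in it for x in a)


def closed_only(freq):
    """Naive post-filter: drop patterns that have an equally supported super-pattern."""
    return {
        p: s
        for p, s in freq.items()
        if not any(
            s == s2 and len(q) > len(p) and is_subsequence(p, q)
            for q, s2 in freq.items()
        )
    }
```

On the toy database `["CAABC", "ABCB", "CABC", "ABBCA"]` with `min_sup=2`, `mine_frequent` finds 17 frequent patterns and `closed_only` keeps 6 of them: AA (2), ABB (2), ABC (4), CA (3), CB (3), and CABC (2). Putting every extended prefix back on the shared queue is the simplest form of dynamic reassignment; a real implementation would also need a cutoff so that very small subtrees are mined locally instead of being queued.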
