
Computer Engineering ›› 2009, Vol. 35 ›› Issue (18): 59-61. doi: 10.3969/j.issn.1000-3428.2009.18.021

• Software Technology and Database •

Effective Mining Algorithm for Parallel Sequential Patterns

TIAN Wei-dong (田卫东), JIANG Hai-hui (姜海辉)

  1. (School of Computer & Information, Hefei University of Technology, Hefei 230009)
  • Online: 2009-09-20 Published: 2009-09-20

Abstract: In shared-memory parallel computing environments, two key problems in mining sequential patterns are imbalanced processor workloads and the lack of effective pruning strategies. To address them, this paper proposes a dynamic task distribution method that balances the workload among processors, and a parallel local pruning technique that improves mining efficiency by eliminating the repeated generation and computation of projected databases. Building on these two methods, it presents the Parallel Fast Sequential Pattern mining algorithm (PFSPAN), a parallel sequential pattern mining algorithm for shared-memory Symmetric MultiProcessor (SMP) systems. Algorithm analysis and experimental results show that PFSPAN mines sequential patterns effectively.

Key words: data mining, sequential patterns, parallel processing, task distribution, local pruning
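
The abstract does not give the details of PFSPAN, so the following is only a minimal sketch of the general idea of dynamic task distribution for projection-based sequential pattern mining, not the authors' algorithm. It assumes sequences of single items, a PrefixSpan-style recursion, and a shared work queue from which worker threads pull one frequent length-1 prefix at a time; a worker that finishes a small subtree early simply takes the next task. All function names (`project`, `frequent_items`, `prefix_span`, `mine_parallel`) are hypothetical, and the paper's parallel local pruning step is not reproduced here.

```python
import queue
import threading
from collections import Counter


def project(db, item):
    """Projected database: the suffix after the first occurrence of item."""
    projected = []
    for seq in db:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:  # empty suffixes cannot extend any pattern
                projected.append(suffix)
    return projected


def frequent_items(db, min_sup):
    """Items occurring in at least min_sup sequences, with their support."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))  # count each item once per sequence
    return sorted((i, c) for i, c in counts.items() if c >= min_sup)


def prefix_span(prefix, db, min_sup, out):
    """Serial PrefixSpan-style recursion on one projected database."""
    for item, sup in frequent_items(db, min_sup):
        pattern = prefix + (item,)
        out[pattern] = sup
        prefix_span(pattern, project(db, item), min_sup, out)


def mine_parallel(db, min_sup, n_workers=4):
    """Dynamic task distribution: workers pull prefixes from a shared queue."""
    tasks = queue.Queue()
    for item, sup in frequent_items(db, min_sup):
        tasks.put((item, sup))  # one task per frequent length-1 prefix

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item, sup = tasks.get_nowait()
            except queue.Empty:
                return  # all tasks were queued up front, so empty means done
            local = {(item,): sup}
            prefix_span((item,), project(db, item), min_sup, local)
            with lock:  # merge this subtree's patterns into the shared result
                results.update(local)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `mine_parallel(db, min_sup)` on a list of item tuples returns a dictionary that maps each frequent pattern to its support. In CPython the global interpreter lock prevents these threads from running the mining work truly in parallel, so the sketch illustrates the load-balancing scheme rather than the speedup an SMP implementation would aim for.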
