Computer Engineering ›› 2012, Vol. 38 ›› Issue (12): 39-41. doi: 10.3969/j.issn.1000-3428.2012.12.011

• Networks and Communications •

Efficient Incremental Mining Algorithm of Sequential Patterns

LIU Jia-xin   

  1. (Library, Yanshan University, Qinhuangdao 066004, China)
  • Received: 2012-03-19 Online: 2012-06-20 Published: 2012-06-20

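  • About the author: LIU Jia-xin (born 1978), female, Ph.D. candidate; main research interest: data mining
  • Supported by:
    National Natural Science Foundation of China (61170190); Qinhuangdao Science and Technology Research and Development Program (201001A018)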

Abstract: Existing incremental mining algorithms must mine the sequence database again whenever the support changes. To reduce the time and space consumed by repeatedly running the mining algorithm during sequential pattern mining, this paper proposes an efficient incremental algorithm for mining sequential patterns. The algorithm uses a frequent sequence tree as its storage structure. When the sequence database is updated and the minimum support changes, it brings the frequent sequence tree up to date through an update operation, then finds all sequential patterns by traversing the tree depth-first. Experimental results show that the algorithm outperforms IncSpan and PrefixSpan in time cost.

Key words: data mining, incremental mining, sequential pattern, projected database, frequent sequence tree
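The abstract does not spell out the layout of the frequent sequence tree, so the following is only a minimal sketch of the general idea: patterns are stored along root-to-node paths together with their support, and a depth-first traversal collects those that meet a given minimum support. The names `Node`, `add_pattern` and `frequent_patterns`, the restriction to single-item elements, and the sample supports are all assumptions made for illustration, not the paper's actual data structure or update operation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical frequent sequence tree."""
    support: int = 0
    children: dict = field(default_factory=dict)  # item -> Node


def add_pattern(root, pattern, support):
    """Store (or overwrite) the support of a pattern at the end of its path."""
    node = root
    for item in pattern:
        node = node.children.setdefault(item, Node())
    node.support = support


def frequent_patterns(root, min_support):
    """Depth-first traversal returning (pattern, support) pairs
    whose support is at least min_support."""
    found = []

    def visit(node, prefix):
        for item in sorted(node.children):
            child = node.children[item]
            # An extension is never more frequent than its prefix,
            # so an infrequent node lets us skip its whole subtree.
            if child.support < min_support:
                continue
            pattern = prefix + (item,)
            found.append((pattern, child.support))
            visit(child, pattern)

    visit(root, ())
    return found


# Illustrative supports only; they are not taken from the paper.
root = Node()
for pattern, support in [
    (("a",), 4), (("a", "b"), 3), (("a", "b", "c"), 2),
    (("b",), 3), (("b", "c"), 2), (("c",), 2),
]:
    add_pattern(root, pattern, support)
```

In this sketch, a different minimum support is handled by traversing the same tree again with a new threshold, and a changed support count by calling `add_pattern` for the affected path. That only works for supports the tree already stores; how the paper's update operation handles newly frequent sequences is not described in the abstract.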

