
Computer Engineering ›› 2007, Vol. 33 ›› Issue (06): 65-68. doi: 10.3969/j.issn.1000-3428.2007.06.023

• Software Technology and Database •

Sequential Patterns Mining Algorithm Based on Coded Frequent Pattern-tree

XU Chunyan (胥春艳)

  1. (School of Management, Tianjin University, Tianjin 300072)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-03-20 Published: 2007-03-20

Abstract: This paper proposes a unified storage structure, the coded frequent pattern-tree (CFP-tree), that holds both one-dimensional and multidimensional sequence data. The algorithm discovers frequent sequential patterns through a progressive prefix-sequence search, which avoids recursively generating large numbers of intermediate subsequences during mining. Experiments show that the algorithm outperforms existing sequential pattern mining algorithms on large-scale data.

Key words: Data mining, Sequential pattern, Multi-dimensional sequence

CLC Number: