
Computer Engineering ›› 2012, Vol. 38 ›› Issue (3): 28-30. doi: 10.3969/j.issn.1000-3428.2012.03.010

• Software Technology and Database •

Incremental Sequential Pattern Algorithm of Reducing Projected Database Size

LIU Jia-xin a, YAN Shu-ting b, REN Jia-dong a

  1. (a. College of Information Science and Engineering; b. Science and Technology Administration Office, Yanshan University, Qinhuangdao 066004, China)
  • Received: 2011-06-03 Online: 2012-02-05 Published: 2012-02-05
  • About the authors: LIU Jia-xin (born 1978), female, Ph.D. candidate, main research interest: data mining; YAN Shu-ting, master's degree; REN Jia-dong, professor and doctoral supervisor
  • Supported by:

    Scientific Research Program of the Hebei Provincial Department of Education (2008498); Natural Science Foundation of Hebei Province (F2010001298); Science and Technology Research and Development Program of Qinhuangdao (201001 A018)

Abstract: Existing incremental sequential pattern mining algorithms handle only two kinds of database updates, insertion and append, and do not consider the deletion of sequences. To address this, this paper proposes an incremental sequential pattern updating algorithm based on a frequent sequence tree, called IUFST. When the database is updated and the support changes, IUFST distinguishes four cases and updates the frequent sequence tree accordingly, which reduces the size of the projected database and improves efficiency. Experimental results show that IUFST outperforms PrefixSpan and IncSpan in running time.

Key words: data mining, incremental mining, sequential pattern, projected database, frequent sequence tree, depth-first
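
For readers unfamiliar with the term "projected database", the following minimal sketch shows prefix projection as used in PrefixSpan, one of the baseline algorithms named in the abstract. It is not IUFST and not the authors' implementation. It assumes each sequence is a plain list of single items rather than itemsets, and the names `project`, `prefixspan`, and `example_db` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def project(database, item):
    """Build the projected database for `item`.

    For every sequence that contains `item`, keep only the suffix that
    follows its first occurrence. Empty suffixes are dropped because
    they cannot extend a pattern any further.
    """
    projected = []
    for seq in database:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:
                projected.append(suffix)
    return projected


def prefixspan(database, min_support, prefix=()):
    """Depth-first enumeration of frequent sequential patterns.

    Returns a dict mapping each frequent pattern (a tuple of items)
    to its support, i.e. the number of sequences that contain it.
    """
    patterns = {}
    # Count each item at most once per sequence.
    counts = Counter(item for seq in database for item in set(seq))
    for item, support in sorted(counts.items()):
        if support >= min_support:
            new_prefix = prefix + (item,)
            patterns[new_prefix] = support
            # Recurse on the smaller, projected database only.
            patterns.update(
                prefixspan(project(database, item), min_support, new_prefix)
            )
    return patterns


# Toy database of four sequences (illustrative data, not from the paper).
example_db = [list("abc"), list("acb"), list("ab"), list("bc")]
patterns = prefixspan(example_db, min_support=2)
```

Each recursive call scans only the suffixes that survive projection, so the cost of mining depends directly on how large the projected databases are. That is the quantity the abstract says IUFST reduces.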

CLC number: