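Abstract: Existing incremental sequential pattern mining algorithms handle only two kinds of database update, insertion and appending, and do not consider the deletion of sequences. To address this, an incremental sequential pattern updating algorithm based on a frequent sequence tree (IUFST) is proposed. When the database and the support change, IUFST updates the frequent sequence tree case by case, which reduces the size of the projected database and improves efficiency. Experimental results show that the algorithm outperforms PrefixSpan and IncSpan in running time.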
Key words:
data mining,
incremental mining,
sequential pattern,
projected database,
frequent sequence tree,
depth-first
Abstract: Existing incremental algorithms for mining sequential patterns handle only two kinds of database update, insertion and appending, and do not address deletion. To solve this problem, this paper proposes IUFST, an incremental sequential pattern updating algorithm based on a frequent sequence tree. When the database is updated and the support changes, IUFST distinguishes four cases in updating the frequent sequence tree. This reduces the size of the projected database and improves efficiency. Experimental results show that IUFST outperforms PrefixSpan and IncSpan in running time.
Key words:
data mining,
incremental mining,
sequential pattern,
projected database,
frequent sequence tree,
depth-first
CLC number:
刘佳新, 严书亭, 任家东. 缩减投影数据库规模的增量式序列模式算法[J]. 计算机工程, 2012, 38(3): 28-30.
LIU Jia-Xin, YAN Shu-Ting, REN Jia-Dong. Incremental Sequential Pattern Algorithm of Reducing Projected Database Size[J]. Computer Engineering, 2012, 38(3): 28-30.