作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2010, Vol. 36 ›› Issue (14): 32-33. doi: 10.3969/j.issn.1000-3428.2010.14.012

• 软件技术与数据库 • 上一篇    下一篇

一种单遍扫描频繁模式树结构

谭 军1,2,卜英勇2,杨 勃2   

  1. (1. 中南林业科技大学计算机学院,长沙 412006;2. 中南大学机电工程学院,长沙 410083)
  • 出版日期:2010-07-20 发布日期:2010-07-20
  • 作者简介:谭 军(1971-),男,讲师、博士研究生,主研方向:数据库技术,数据挖掘;卜英勇,教授、博士生导师;杨 勃,讲师、博士研究生
  • 基金资助:
    国家自然科学基金资助项目“深海钴结壳微地形检测技术及最佳采集深度建模研究”(50474052)

Single-pass Frequent Pattern Tree Structure

TAN Jun1,2, BU Ying-yong2, YANG Bo2   

  1. (1. College of Computer Science, Central South University of Forestry and Technology, Changsha 412006;2. College of Mechanical Electrical Engineering, Central South University, Changsha 410083)
  • Online:2010-07-20 Published:2010-07-20

摘要: 针对频繁模式增长算法无法适应数据流的无限性和流动性的特点,提出一种新颖的FP-tree的变形结构-SP-tree,只需单遍扫描便能容纳全部数据库信息。为使SP-tree具有与FP-tree一样良好的压缩性能,给出一种有效的动态重构树的方法,称为宽度排序方法,该方法能够在挖掘过程中动态地逐条分支地重构树,最终产生一棵频繁递减的前缀树。实验结果表明,SP-tree的压缩性能优于其他单遍扫描的前缀树结构。

关键词: 数据流, 频繁模式增长算法, 单遍扫描模式树, 宽度排序方法

Abstract: Aiming at the problem that FP-growth algorithm requires two database scans, which are not consistent with efficient data stream processing, this paper presents a novel tree structure which is a variation of FP-tree, called SP-tree, which captures database information with one scan. For making SP-tree have the same compact performance, it presents an efficient dynamic tree restructuring method, called the breadth sorting method, which restructures a frequency-descending prefix-tree branch-by-branch. Experimental results show that compact performance of the SP-tree is better than other prefix-tree structure with one scan.

Key words: data stream, FP-growth algorithm, single-pass pattern tree, breadth sorting method

中图分类号: