Abstract: The set of frequent closed patterns uniquely determines the complete set of frequent patterns. Based on the characteristics of data streams, an algorithm is proposed for mining frequent closed itemsets. The algorithm partitions the data stream into segments and uses a DSFCI_tree to store potential frequent closed itemsets dynamically. For each arriving batch of data, it builds a local DSFCI_tree, then uses it to update and prune the global DSFCI_tree, so that the frequent closed patterns of the entire data stream are mined effectively. Experiments show that the algorithm has good time and space efficiency.
Key words: data mining, data streams, association rules, frequent closed itemsets
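The two ideas in the abstract, batch-wise processing and the fact that closed itemsets determine all frequent itemsets, can be illustrated with a small sketch. This is not the DSFCI_tree algorithm itself. As assumptions for illustration, a plain `Counter` stands in for the tree, itemsets are enumerated by brute force, and no pruning is performed because the abstract does not state the pruning rule. The names `local_counts`, `closed_frequent`, and `support_from_closed` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(batch):
    """Count every itemset contained in the transactions of one batch.

    Brute-force stand-in for building a local tree (assumption, not the paper's method).
    """
    counts = Counter()
    for transaction in batch:
        items = sorted(set(transaction))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


def closed_frequent(counts, min_count):
    """Keep frequent itemsets that have no proper superset with the same count."""
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


def support_from_closed(itemset, closed):
    """Recover an itemset's support as the largest count among its closed supersets.

    Returns 0 when no closed superset exists, i.e. the itemset is not frequent.
    """
    return max((c for s, c in closed.items() if itemset <= s), default=0)


# Two batches of a toy stream; the global summary is updated after each batch.
batches = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["a"]],
]
global_counts = Counter()
for batch in batches:
    global_counts.update(local_counts(batch))  # merge local counts into global

closed = closed_frequent(global_counts, min_count=2)
# closed holds {a}: 4, {a, b}: 2, {a, c}: 2
print(support_from_closed(frozenset({"b"}), closed))  # prints 2
```

Here `{b}` is frequent but not closed, because `{a, b}` has the same count. Its support is still recoverable from the closed set, which is why storing only closed itemsets loses no information about the frequent ones.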
CHENG Zhuan-liu (程转流); HU Xue-gang (胡学钢). Frequent Closed Patterns Mining over Data Streams (数据流中频繁闭合模式的挖掘)[J]. Computer Engineering (计算机工程), 2008, 34(16): 50-52.