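Abstract: Association rule mining algorithms usually work on data in a horizontal layout, although mining on a vertical layout performs better than mining on a horizontal one. When the itemsets are very large, the intersection operations of the Eclat algorithm consume a great deal of time and memory. To address this, the partitioning idea is combined with a proposed probability-based prior constraint: the transactions in the database are divided into several non-overlapping parts and Eclat is applied to each part, which reduces the size of the itemsets involved in each intersection and therefore the number of comparisons. The probability-based prior constraint reduces the number of local frequent itemsets that are generated. Experimental results show that the improved algorithm is more efficient than the original algorithm.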
Keywords:
association rules,
Eclat algorithm,
partitioning,
probabilistic prior
Abstract: Although current association rule mining algorithms mostly use a horizontal transaction database, a vertical transaction database has advantages over a horizontal one. In the Eclat algorithm, however, when the Tidsets are very large the intersection step consumes a lot of time and memory. To address this shortcoming, an improved algorithm, Declat, is presented. The algorithm applies partitioning to Eclat, which reduces the size of the Tidsets involved in each intersection, and it proposes a prior constraint that reduces the number of local frequent itemsets. Experimental results show that the improved algorithm is more efficient than the Eclat algorithm.
Key words:
association rules,
Eclat algorithm,
division,
probability priority
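The partition-then-intersect idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Declat implementation: the function names (`to_vertical`, `eclat`, `partitioned_eclat`) are hypothetical, the two-phase scheme (mine each partition locally, then verify candidates against the whole database) is an assumption borrowed from the classic partition approach, and the probability-based prior constraint is left out because the abstract does not define it.

```python
from collections import defaultdict
from math import ceil


def to_vertical(transactions):
    """Convert horizontal transactions (lists of items) to item -> Tidset."""
    vertical = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item].add(tid)
    return vertical


def eclat(vertical, min_count):
    """Depth-first Eclat: grow itemsets by intersecting Tidsets."""
    frequent = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset of prefix + item), already frequent.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids  # the intersection step
                if len(common) >= min_count:
                    suffix.append((other, common))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


def partitioned_eclat(transactions, min_support, n_parts):
    """Assumed two-phase scheme: local Eclat per partition, then a global check."""
    n = len(transactions)
    size = ceil(n / n_parts)
    candidates = set()
    for begin in range(0, n, size):
        part = transactions[begin:begin + size]
        # Smaller partitions mean smaller Tidsets in every intersection.
        local_min = max(1, ceil(min_support * len(part)))
        candidates.update(eclat(to_vertical(part), local_min))

    # A globally frequent itemset is locally frequent in at least one
    # partition, so checking the union of local results loses nothing.
    global_vertical = to_vertical(transactions)
    global_min = ceil(min_support * n)
    result = {}
    for itemset in candidates:
        tids = set.intersection(*(global_vertical[i] for i in itemset))
        if len(tids) >= global_min:
            result[itemset] = len(tids)
    return result
```

In this sketch the saving comes from the first phase, where every Tidset is bounded by the partition size. The second phase still counts support over the full database, and it is there that a constraint reducing the number of local frequent itemsets, such as the one the abstract mentions, would cut the remaining work.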
CLC number:
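ZHANG Yu-Fang (张玉芳), XIONG Zhong-Yang (熊忠阳), GENG Xiao-Fei (耿晓斐), CHEN Jian-Min (陈剑敏). Analysis and Improvement of Eclat Algorithm (Eclat算法的分析及改进) [J]. Computer Engineering (计算机工程), 2010, 36(23): 28-30.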
ZHANG Yu-Fang, XIONG Zhong-Yang, GENG Xiao-Fei, CHEN Jian-Min. Analysis and Improvement of Eclat Algorithm[J]. Computer Engineering, 2010, 36(23): 28-30.