Abstract:
Many clustering algorithms are ill-suited to categorical data. This paper therefore presents a clustering algorithm based on the longest frequent closed itemsets (LFCIs). An adapted version of the traditional frequent-pattern tree is used to find the LFCI of each transaction. Because of two important properties of the LFCI, it can serve as a description of the corresponding transaction. As a result, the clusters are derived directly from the LFCIs, without generating a large intermediate set of frequent itemsets. Experimental results demonstrate the feasibility and robustness of the method.
Key words:
Categorical data,
Clustering algorithm,
Closed itemsets,
Frequent-pattern tree
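The idea summarized in the abstract can be illustrated with a small brute-force sketch. It is not the authors' method: the paper finds each LFCI with an adapted frequent-pattern tree, whereas this sketch simply enumerates the subsets of each transaction from largest to smallest. The function names, the `min_support` parameter, the tie-breaking rule (first candidate in sorted order), and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def longest_frequent_closed_itemset(transaction, transactions, min_support):
    """Return the longest frequent closed itemset contained in `transaction`.

    Brute force (exponential in the transaction length), for illustration only.
    Assumes `transaction` is itself a member of `transactions`.
    """
    items = sorted(transaction)
    for size in range(len(items), 0, -1):  # try the longest candidates first
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Cover: every transaction that contains the candidate itemset.
            cover = [t for t in transactions if candidate <= t]
            if len(cover) < min_support:
                continue  # not frequent
            # Closed: the candidate equals the intersection of its cover,
            # so no proper superset has the same support.
            if frozenset.intersection(*cover) == candidate:
                return candidate
    return frozenset()  # no frequent closed itemset in this transaction


def cluster_by_lfci(transactions, min_support):
    """Group transaction indices by their longest frequent closed itemset."""
    transactions = [frozenset(t) for t in transactions]
    clusters = defaultdict(list)
    for index, t in enumerate(transactions):
        key = longest_frequent_closed_itemset(t, transactions, min_support)
        clusters[key].append(index)
    return dict(clusters)


# Hypothetical categorical records, encoded as attribute=value items.
records = [
    {"color=red", "shape=round", "size=small"},
    {"color=red", "shape=round", "size=large"},
    {"color=red", "shape=round", "size=small"},
    {"color=blue", "shape=square", "size=large"},
    {"color=blue", "shape=square", "size=small"},
]
clusters = cluster_by_lfci(records, min_support=2)
for itemset, members in clusters.items():
    print(sorted(itemset), members)
```

With a minimum support of 2, records 0 and 2 share the LFCI {color=red, shape=round, size=small}, record 1 falls back to {color=red, shape=round}, and records 3 and 4 share {color=blue, shape=square}. Each LFCI acts as the description of its cluster, and no full set of frequent itemsets is materialized along the way.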
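Abstract: Because many algorithms are unsuitable for clustering categorical data, a clustering algorithm based on the longest frequent closed itemset (LFCI) is proposed. An adapted frequent-pattern tree is used to obtain the LFCI of each transaction. Owing to two important properties of the LFCI, it can serve as the description of that transaction, so the clustering result is obtained directly. Experiments demonstrate the effectiveness of the algorithm.
Key words:
Categorical data,
Clustering algorithm,
Closed itemsets,
Frequent-pattern tree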
ZHANG Zehong; ZHANG Wei. Clustering Algorithm Based on the Longest Frequent Closed Itemsets[J]. Computer Engineering, 2007, 33(01): 187-189.
张泽洪;张 伟. 基于最长频繁闭项集的聚类算法[J]. 计算机工程, 2007, 33(01): 187-189.