Computer Engineering (计算机工程) ›› 2007, Vol. 33 ›› Issue (01): 187-189. doi: 10.3969/j.issn.1000-3428.2007.01.065

• Artificial Intelligence and Recognition Technology •

Clustering Algorithm Based on the Longest Frequent Closed Itemsets

ZHANG Zehong (张泽洪), ZHANG Wei (张伟)

  1. (School of Information Engineering, Southern Yangtze University, Wuxi 214122)
  • Online: 2007-01-05 Published: 2007-01-05

Abstract: Many clustering algorithms are not well suited to categorical data. This paper therefore presents a clustering algorithm based on the longest frequent closed itemset (LFCI). Using an adapted frequent-pattern tree, the algorithm finds the LFCI of each transaction. Because of two important properties of the LFCI, it can serve as the description of the corresponding transaction. The clusters are thus derived directly from the LFCIs, without generating a large intermediate set of frequent itemsets. Experimental results demonstrate the feasibility and robustness of the method.

Key words: Categorical data, Clustering algorithm, Closed itemsets, Frequent-pattern tree
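
The idea summarized in the abstract, describing each transaction by its longest frequent closed itemset and grouping transactions that share the same description, can be illustrated with a small sketch. This is not the authors' implementation: it replaces the adapted frequent-pattern tree with a brute-force enumeration of closed itemsets that is only practical for tiny data sets. The names `closed_itemsets`, `cluster_by_lfci`, and `min_support`, the tie-breaking rule, and the sample transactions are all assumptions made for illustration.

```python
from collections import defaultdict


def closed_itemsets(transactions):
    """Return all non-empty closed itemsets of a list of frozensets.

    An itemset is closed exactly when it equals the intersection of the
    transactions that contain it, so the closed itemsets are the
    intersections of one or more transactions. Brute force, small data only.
    """
    closed = set(transactions)
    frontier = set(transactions)
    while frontier:
        new = set()
        for itemset in frontier:
            for t in transactions:
                common = itemset & t
                if common and common not in closed:
                    new.add(common)
        closed |= new
        frontier = new
    return closed


def cluster_by_lfci(transactions, min_support):
    """Group transaction indices by their longest frequent closed itemset."""
    txs = [frozenset(t) for t in transactions]
    support = {c: sum(1 for t in txs if c <= t) for c in closed_itemsets(txs)}
    frequent = [c for c, s in support.items() if s >= min_support]

    clusters = defaultdict(list)
    for idx, t in enumerate(txs):
        candidates = [c for c in frequent if c <= t]
        if not candidates:
            # Assumption: transactions with no frequent closed subset
            # are collected under the empty itemset.
            clusters[frozenset()].append(idx)
            continue
        # Assumption: ties in length are broken by higher support,
        # then by item order, to keep the result deterministic.
        best = max(candidates, key=lambda c: (len(c), support[c], sorted(c)))
        clusters[best].append(idx)
    return dict(clusters)


# Hypothetical categorical transactions, for illustration only.
sample = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "b"},
    {"x", "y"},
    {"x", "y", "z"},
]
clusters = cluster_by_lfci(sample, min_support=2)
for description, members in clusters.items():
    print(sorted(description), members)
```

With `min_support=2`, the first three sample transactions share the description `{a, b}` and the last two share `{x, y}`, so the cluster labels come directly from the itemsets and no separate distance measure is needed.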