Abstract: Affinity propagation clustering is fast and effective and can handle large data sets, but its accuracy is low when the data have a loose cluster structure. This paper proposes a semi-supervised affinity propagation clustering algorithm that embeds cluster validity indices into the iteration process to supervise and guide the algorithm toward an optimal clustering result. Experimental results show that the method gives relatively accurate clustering results for data sets with both compact and loose cluster structures.
Key words:
affinity propagation clustering,
semi-supervised clustering,
clustering algorithm for large data sets
CLC number:
WANG Kai-jun; LI Jian; ZHANG Jun-ying; TU Chong-yang. Semi-supervised Affinity Propagation Clustering[J]. Computer Engineering, 2007, 33(23): 197-198.
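
The idea summarized in the abstract, scoring intermediate clusterings with a validity index while affinity propagation iterates, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which validity indices are used or how they steer the iterations, so the sketch assumes the silhouette index and simply keeps the best-scoring clustering seen so far. The function names, the `preference` value, and the damping factor are hypothetical choices made for illustration.

```python
import math


def silhouette(points, labels):
    """Mean silhouette width (assumed validity index); higher is better."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(points[i], points[j]) for j in m) / len(m)
                for k, m in clusters.items() if k != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def ap_with_validity(points, preference, iters=100, damping=0.5):
    """Affinity propagation that scores each intermediate clustering.

    Assumes at least two points. Returns (labels, score) for the clustering
    with the highest silhouette seen in any iteration; labels are exemplar
    indices. Returns (None, -1.0) if fewer than two exemplars ever appear.
    """
    n = len(points)
    # Similarity: negative squared Euclidean distance, preference on diagonal.
    S = [[-math.dist(points[i], points[k]) ** 2 for k in range(n)]
         for i in range(n)]
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    best_labels, best_score = None, -1.0
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != first)
            for k in range(n):
                new = S[i][k] - (second if k == first else vals[first])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # self-responsibility enters unclipped
            total = sum(pos)
            for i in range(n):
                new = total - pos[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
        # Current exemplars and the clustering they induce.
        exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
        if len(exemplars) < 2:
            continue
        labels = [i if i in exemplars
                  else max(exemplars, key=lambda k: S[i][k])
                  for i in range(n)]
        # Validity check inside the iteration: keep the best clustering so far.
        score = silhouette(points, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels, best_score
```

Calling `ap_with_validity(points, preference=-50.0)` on a list of coordinate tuples returns the best-scoring labels together with their silhouette value. A fuller variant might also adjust `preference` between runs according to the index, but that goes beyond what the abstract states.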