Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology •

A Spectral Clustering Algorithm Based on Fuzzy Kernel Clustering

FAN Zijing 1, LUO Ze 1, MA Yongzheng 2

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. China Internet Network Information Center, Beijing 100190, China
  • Received: 2016-08-23 Online: 2017-11-15 Published: 2017-11-15
  • About the authors: FAN Zijing (born 1990), female, master's student, whose main research interests are machine learning and data mining; LUO Ze, researcher and doctoral supervisor; MA Yongzheng (corresponding author), associate researcher.
  • Funding: National Natural Science Foundation of China (61361126011).

Abstract: Spectral clustering clusters the eigenvectors of the Laplacian matrix of the samples. It is not restricted by the distribution shape of the original data and can converge to the global optimal solution, but it cannot accurately reflect the actual relationships between samples. Fuzzy kernel clustering, in contrast, uses fuzzy mathematics theory to determine the fuzzy relationships between samples. This paper therefore adjusts the similarity measure function and the distance measure function, merges fuzzy kernel clustering into the spectral clustering algorithm, and proposes the SC-KFCM algorithm. SC-KFCM uses a fuzzy partition to improve the hard partition in spectral clustering: it establishes fuzzy membership relations from the similarity and degree of correlation between eigenvectors and clusters the samples accordingly, which compensates for the effect of the hard partition step on the clustering results. Experimental results show that SC-KFCM achieves relatively stable clustering results and relatively high clustering accuracy on data sets with different distribution characteristics and different dimensions.

Key words: clustering analysis, spectral clustering, distance measure, fuzzy kernel clustering, fuzzy set, membership, kernel function
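
The pipeline outlined in the abstract (a spectral embedding followed by a fuzzy kernel partition instead of a hard one) can be sketched roughly as below. The abstract does not give the paper's adjusted similarity and distance measure functions, so this is not the authors' SC-KFCM. It is a minimal illustration that assumes a plain Gaussian affinity, the usual normalized spectral embedding, and the common Gaussian-kernel fuzzy c-means updates. All function names and the parameters `sigma`, `gamma`, and `m` are hypothetical.

```python
import numpy as np


def spectral_embedding(X, n_clusters, sigma=1.0):
    """Row-normalized top eigenvectors of the normalized affinity matrix."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))  # Gaussian similarity (assumed)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    # D^{-1/2} W D^{-1/2}: its largest eigenvectors are the smallest
    # eigenvectors of the normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    Y = vecs[:, -n_clusters:]
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)


def farthest_point_init(Y, n_clusters):
    """Deterministic initial centers: start at Y[0], then add farthest points."""
    centers = [Y[0]]
    for _ in range(n_clusters - 1):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    return np.array(centers)


def kernel_fcm(Y, n_clusters, m=2.0, gamma=1.0, n_iter=100, tol=1e-6):
    """Fuzzy c-means with a Gaussian kernel-induced distance 1 - K(y, v)."""
    V = farthest_point_init(Y, n_clusters)
    U = None
    for _ in range(n_iter):
        sq = np.sum((Y[:, None, :] - V[None, :, :]) ** 2, axis=-1)  # (n, c)
        K = np.exp(-gamma * sq)
        dist = np.maximum(1.0 - K, 1e-12)  # clamp to avoid division by zero
        inv = dist ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # memberships sum to 1
        weights = (U_new ** m) * K
        V = (weights.T @ Y) / weights.sum(axis=0)[:, None]  # center update
        converged = U is not None and np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


def sc_kfcm_sketch(X, n_clusters, sigma=1.0, m=2.0, gamma=1.0):
    """Embed samples spectrally, then partition the embedding fuzzily."""
    Y = spectral_embedding(X, n_clusters, sigma=sigma)
    U, _ = kernel_fcm(Y, n_clusters, m=m, gamma=gamma)
    return U.argmax(axis=1), U  # hard labels plus the fuzzy membership matrix
```

Calling `sc_kfcm_sketch(X, n_clusters=2)` on an `(n, d)` array returns hard labels together with the membership matrix `U`. Each row of `U` sums to 1 and gives a sample's degree of membership in every cluster, which is the information a hard k-means step on the eigenvectors would discard.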
