Computer Engineering (计算机工程) ›› 2020, Vol. 46 ›› Issue (6): 108-114. doi: 10.19678/j.issn.1000-3428.0054930

• Artificial Intelligence and Pattern Recognition •

Adaptive Correlation Fusion Clustering Algorithm Based on Natural Neighbor

LI Ping, GONG Xiaofeng, LUO Ruisen

  1. College of Electrical Engineering and Information Technology, Sichuan University, Chengdu 610065, China
  • Received: 2019-05-16 Revised: 2019-06-18 Published: 2019-06-29
  • About the authors: LI Ping (born 1994), female, master's student, main research interest: pattern recognition; GONG Xiaofeng, professor; LUO Ruisen, lecturer, Ph.D., postdoctoral researcher.
  • Funding:
    China Postdoctoral Science Foundation (2017M612958).

Abstract: Most traditional clustering algorithms require clustering parameters to be set in advance and cannot effectively identify outliers and noise points. To address this problem, this paper proposes an adaptive correlation fusion clustering algorithm. The algorithm uses the natural neighbor search algorithm to compute the density distribution of the dataset, screens out representative kernel points that carry the structural information of the data, and excludes the influence of boundary points and noise points on the clustering result. It then introduces a correlation matrix: by calculating the correlation degree and the fusion measurement between clusters, it selects the optimally correlated clusters for fusion to obtain the final clustering result. Experimental results show that the proposed algorithm needs no manually set clustering parameters and achieves higher clustering accuracy and reliability than the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and the K-means clustering algorithm.

Key words: natural neighbor, scale-free neighborhood, representative kernel point, fusion measurement, density level
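
The abstract names natural neighbor search as the parameter-free step that yields the density information, but it gives no details. The following is a minimal Python sketch of one commonly described formulation of that search, not the authors' implementation. The function name `natural_neighbor_search`, the brute-force distance sort, and the stopping rule (stop once every point has a reverse neighbor, or once the number of points without one stops changing) are assumptions made for illustration. The selection of representative kernel points and the fusion step are not covered.

```python
import math


def natural_neighbor_search(points):
    """Illustrative natural neighbor search (assumed formulation).

    Returns (r, reverse_count, natural), where r is the neighborhood size
    at which the search stopped, reverse_count[i] is how many points chose
    point i as one of their r nearest neighbors, and natural[i] is the set
    of points that are mutual r-nearest neighbors of point i.
    """
    n = len(points)

    # Brute-force neighbor ordering: for each point, every other point
    # sorted by Euclidean distance. Fine for a small illustration.
    order = []
    for i, p in enumerate(points):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        order.append(others)

    reverse_count = [0] * n
    neighbors = [set() for _ in range(n)]
    prev_orphans = None
    r = 0

    while r < n - 1:
        r += 1
        # Grow every neighborhood by one: add the r-th nearest neighbor.
        for i in range(n):
            j = order[i][r - 1]
            neighbors[i].add(j)
            reverse_count[j] += 1

        # "Orphans" are points nobody has chosen as a neighbor yet.
        orphans = sum(1 for c in reverse_count if c == 0)

        # Assumed stopping rule: no orphans left, or the orphan count
        # did not change since the previous round.
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans

    # Natural neighbors are mutual: j is in i's neighborhood and vice versa.
    natural = [{j for j in neighbors[i] if i in neighbors[j]} for i in range(n)]
    return r, reverse_count, natural


if __name__ == "__main__":
    # Four points on a unit square plus one far-away point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    r, counts, natural = natural_neighbor_search(data)
    print("stopped at r =", r)
    print("reverse neighbor counts:", counts)
    print("natural neighbors:", natural)
```

In this sketch a point whose reverse neighbor count stays at zero, such as the isolated point in the usage example, is a natural candidate for noise, and the counts of the remaining points give a rough, scale-free indication of local density. How the paper turns these quantities into representative kernel points is not stated in the abstract.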

CLC Number: