Computer Engineering

• Artificial Intelligence and Recognition Technology

Arbitrary Shape Clustering Algorithm Based on Hierarchy and Density

XU Heli, NIU Lijun

  1. (School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China)
  • Received: 2015-09-28 Online: 2016-07-15 Published: 2016-07-15
  • About the authors: XU Heli (born 1963), male, professor; main research interests are data mining and network database technology. NIU Lijun, master's student.
  • Funding:
    National Natural Science Foundation of China (61202286); National Science and Technology Major Project of China (2014ZX01045-102).

Abstract: Combining hierarchical clustering and density-based clustering, this paper proposes a new arbitrary-shape clustering algorithm. The algorithm takes density peak points as initial cluster centers and partitions the dataset into a large number of sub-clusters. Under its merging criterion, two adjacent sub-clusters are merged when the density of the border region between them is greater than or equal to the average density of either sub-cluster. Sub-clusters are merged through dynamic modeling, so the termination point of clustering is determined automatically, without a manually supplied termination parameter. Experimental results on test datasets and real datasets show that the algorithm is robust to the choice of input parameters, effectively identifies clusters of arbitrary shape, size, and density, and is suitable for datasets with uneven density distribution.

Key words: hierarchical clustering algorithm, density clustering algorithm, arbitrary shape clustering, dynamic model, border region density, density peak point
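
The merging criterion stated in the abstract can be illustrated with a minimal sketch. The abstract does not define how densities or border regions are computed, so the sketch takes them as precomputed inputs. The names `should_merge` and `merge_subclusters` are hypothetical, and reading "either sub-cluster" as "at least one of the two" is an assumption. Densities are also not recomputed after a merge, which simplifies the dynamic modeling the abstract mentions.

```python
from typing import Dict, List, Tuple


def should_merge(border_density: float, avg_a: float, avg_b: float) -> bool:
    """Merge test (assumed reading): the border-region density is at least
    the average density of at least one of the two sub-clusters."""
    return border_density >= min(avg_a, avg_b)


def merge_subclusters(
    avg_density: Dict[int, float],
    border_density: Dict[Tuple[int, int], float],
) -> List[List[int]]:
    """Group sub-cluster ids in a single pass over adjacent pairs.

    avg_density maps a sub-cluster id to its average density.
    border_density maps an adjacent pair (a, b) to the density of the
    border region between them. Both are assumed to be precomputed.
    """
    # Union-find parent table: every sub-cluster starts as its own group.
    parent = {c: c for c in avg_density}

    def find(c: int) -> int:
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for (a, b), density in border_density.items():
        if should_merge(density, avg_density[a], avg_density[b]):
            parent[find(a)] = find(b)

    # Collect members of each group, sorted for a stable result.
    groups: Dict[int, List[int]] = {}
    for c in avg_density:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    avg = {0: 5.0, 1: 4.0, 2: 6.0}
    border = {(0, 1): 4.5, (1, 2): 1.0}
    print(merge_subclusters(avg, border))  # [[0, 1], [2]]
```

In this toy input, sub-clusters 0 and 1 are joined because their border density (4.5) reaches the lower of their two average densities (4.0), while sub-cluster 2 stays separate because its border with sub-cluster 1 is sparse (1.0).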
