Abstract: To cluster multi-density data sets, this paper proposes a multi-density clustering algorithm based on density reachability. Grid partitioning is used to speed up the computation of each point's density. Each cluster starts from the highest-density point and expands outward step by step, following the density-reachable concept and a breadth-first strategy. Experimental results show that the algorithm effectively clusters both uniform-density and multi-density data sets of arbitrary shape and size, identifies outliers and noise well, and outperforms the SNN algorithm in accuracy and efficiency.
Key words:
clustering algorithm,
neighborhood grid,
density-reachable,
breadth-first,
multi-density
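The abstract names three ingredients: a grid to speed up density computation, seeding from the highest-density point, and breadth-first expansion over density-reachable points. The following is a minimal sketch of how these could fit together, not the authors' implementation. The names `eps`, `ratio`, and `min_size`, the density definition (number of points within `eps`), and the reachability rule (a neighbor joins if its density is at least `ratio` times the seed's density) are all assumptions made for illustration; the paper's exact definitions are not given here.

```python
from collections import defaultdict, deque
from math import dist, floor


def build_grid(points, eps):
    """Hash each 2-D point index into a square grid cell of side eps."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(i)
    return grid


def neighbors(i, points, grid, eps):
    """Indices of points within eps of point i, scanning only the 3x3 cell block."""
    x, y = points[i]
    cx, cy = floor(x / eps), floor(y / eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                if j != i and dist(points[i], points[j]) <= eps:
                    found.append(j)
    return found


def cluster(points, eps=1.0, ratio=0.5, min_size=3):
    """Return one label per point; -1 marks noise (hypothetical convention)."""
    grid = build_grid(points, eps)
    nbrs = [neighbors(i, points, grid, eps) for i in range(len(points))]
    density = [len(n) for n in nbrs]  # assumed density: neighbor count within eps
    labels = [None] * len(points)
    next_label = 0
    # Visit candidate seeds from highest to lowest density.
    for seed in sorted(range(len(points)), key=lambda i: -density[i]):
        if labels[seed] is not None:
            continue
        labels[seed] = next_label
        members = [seed]
        queue = deque([seed])
        while queue:  # breadth-first expansion from the seed
            p = queue.popleft()
            for q in nbrs[p]:
                # Assumed reachability rule: density comparable to the seed's.
                if labels[q] is None and density[q] >= ratio * density[seed]:
                    labels[q] = next_label
                    members.append(q)
                    queue.append(q)
        if len(members) < min_size:
            for m in members:  # too small to be a cluster: treat as noise
                labels[m] = -1
        else:
            next_label += 1
    return labels
```

Because the threshold is relative to each seed's own density, a dense region and a sparse region can each form a cluster under the same `eps`, which is the multi-density behavior the abstract describes. Calling `cluster(points)` on a list of `(x, y)` tuples returns a label list aligned with the input order.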
XUE Li-xiang, QIU Bao-zhi. Density-reachable Based Clustering Algorithm for Multi-density[J]. Computer Engineering, 2009, 35(17): 66-68.