Abstract:
An efficient incremental algorithm for density-based clustering is presented. The main ideas are as follows: (1) The data are sampled using partitioning and sampling techniques. (2) The data are clustered based on density and a grid. (3) When the threshold is adjusted, an incremental algorithm recalculates only the affected data. (4) After data are inserted or deleted in a dynamic environment, the incremental algorithm is used to re-cluster the data. Experiments show that the new algorithm efficiently processes noisy high-dimensional data and greatly speeds up mining.
Key words: data mining, clustering algorithm, density, incremental algorithm
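Steps (2) and (3) of the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify in detail. It assumes a uniform grid, treats a cell as dense when it holds at least `min_pts` points, and merges touching dense cells into clusters. The names `cell_of`, `cluster_dense_cells`, `affected_cells`, `width`, and `min_pts` are hypothetical. The sampling step and the insertion and deletion case are omitted.

```python
from collections import Counter, deque
from itertools import product


def cell_of(point, width):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(int(x // width) for x in point)


def cluster_dense_cells(counts, min_pts):
    """Label dense cells (count >= min_pts); touching dense cells share a label."""
    dense = {c for c, n in counts.items() if n >= min_pts}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            # Visit all 3^d neighbouring cells (assumption: fine for small d only).
            for offset in product((-1, 0, 1), repeat=len(cell)):
                nb = tuple(a + b for a, b in zip(cell, offset))
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels


def affected_cells(counts, old_min_pts, new_min_pts):
    """Cells whose dense/sparse status flips when the threshold changes."""
    lo, hi = sorted((old_min_pts, new_min_pts))
    return {c for c, n in counts.items() if lo <= n < hi}


# Illustrative data only (not from the paper).
points = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.5), (1.2, 0.3), (1.4, 0.6), (5.5, 5.5)]
counts = Counter(cell_of(p, 1.0) for p in points)
labels = cluster_dense_cells(counts, min_pts=2)
```

When the threshold changes, only the cells returned by `affected_cells` switch between dense and sparse, so an incremental version would need to revisit just those cells and their neighbours rather than re-clustering the whole grid.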
LIU Jianye; LI Fang. An Efficient Incremental Algorithm for Clustering Based on Density[J]. Computer Engineering, 2006, 32(21): 76-78.