
Computer Engineering ›› 2006, Vol. 32 ›› Issue (21): 76-78.

• Software Technology and Database •

An Efficient Incremental Algorithm for Clustering Based on Density

LIU Jianye1,2, LI Fang1   

  1. Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200030; 2. Oracle China, Shanghai 200021
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-11-05 Published: 2006-11-05

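Chinese title (translated): A High-Performance Incremental Clustering Algorithm Based on Density

LIU Jianye (刘建晔)1,2, LI Fang (李芳)1

  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030; 2. Oracle (China), Shanghai 200021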

Abstract: An efficient incremental algorithm for density-based clustering is presented. It consists of four main parts: (1) the data are sampled using partitioning and sampling techniques; (2) the data are clustered using density and grid techniques; (3) when the threshold is adjusted, an incremental algorithm recalculates only the affected data; (4) after data are inserted or deleted in a dynamic environment, an incremental algorithm re-clusters the data. Experiments show that the new algorithm processes noisy high-dimensional data efficiently and greatly speeds up mining.
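The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea behind steps (2) and (3), not the authors' method. It assumes a uniform grid, a cell counted as dense when it holds at least `threshold` points, and clusters formed from adjacent dense cells. All function names, the `cell_size` parameter, and the adjacency rule are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(int(x // cell_size) for x in point)


def count_cells(points, cell_size):
    """Count how many points fall into each grid cell."""
    counts = defaultdict(int)
    for p in points:
        counts[cell_of(p, cell_size)] += 1
    return counts


def neighbors(cell):
    """Yield the cells adjacent to `cell` (sharing a face, edge, or corner)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        if any(offset):
            yield tuple(c + o for c, o in zip(cell, offset))


def cluster_dense_cells(counts, threshold):
    """Label connected groups of dense cells; sparse cells are left unlabeled (noise)."""
    dense = {c for c, n in counts.items() if n >= threshold}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        # Breadth-first search over adjacent dense cells.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nb in neighbors(current):
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels


def cells_affected_by_threshold(counts, old_threshold, new_threshold):
    """Return the cells whose dense/sparse status flips when the threshold changes."""
    lo, hi = sorted((old_threshold, new_threshold))
    return {c for c, n in counts.items() if lo <= n < hi}
```

In this sketch, a threshold change only flips the status of cells whose counts lie between the old and new thresholds, so an incremental variant would need to revisit only those cells and their neighbors rather than relabel the whole grid. The sketch relabels from scratch for simplicity, and enumerating all adjacent cells grows as 3^d in the dimension d, so it is not suitable as written for high-dimensional data.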


Key words: Data mining, Clustering algorithm, Density, Incremental algorithm

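Abstract (translated from the Chinese): A high-performance density-based incremental clustering algorithm is proposed and proved. The main work of the algorithm includes: (1) extracting and cleaning the data using partitioning and sampling techniques; (2) clustering the data using density and grid techniques; (3) an incremental algorithm, applied after the threshold is changed, that recomputes the clustering only for the affected points; (4) an incremental clustering algorithm for data insertion and deletion in a dynamic environment. Experiments show that the algorithm handles high-dimensional data well, filters noisy data effectively, and greatly reduces clustering time.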

Key words (translated from the Chinese): Data mining, Clustering algorithm, Density, Incremental algorithm
