Abstract:
To address the deficiencies of several popular data stream clustering algorithms, a data stream clustering algorithm based on density and fractal dimension is presented. It uses a two-phase framework of online and offline processing and combines the advantages of density-based clustering and fractal clustering to overcome the shortcomings of traditional data stream clustering algorithms. To reflect the timeliness of the data stream, a density decay strategy is applied to data points when grid density is computed. Experimental results show that the algorithm improves the efficiency and accuracy of data stream clustering and can find clusters of arbitrary shape as well as non-neighboring clusters.
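The density decay idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's decay formula, so the exponential form `decay ** elapsed`, the class name `DecayingGrid`, and the parameters `cell_width` and `decay` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import defaultdict


class DecayingGrid:
    """Hypothetical grid whose cell densities fade as the stream ages."""

    def __init__(self, cell_width=1.0, decay=0.998):
        # Assumption: exponential decay with a factor in (0, 1).
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        self.cell_width = cell_width
        self.decay = decay
        self.density = defaultdict(float)  # cell -> density at last update
        self.last_update = {}              # cell -> time of last update

    def cell_of(self, point):
        """Map a point to the integer index of the grid cell containing it."""
        return tuple(math.floor(x / self.cell_width) for x in point)

    def insert(self, point, t):
        """Add a point arriving at time t; older mass in its cell decays first."""
        cell = self.cell_of(point)
        elapsed = t - self.last_update.get(cell, t)
        self.density[cell] = self.density[cell] * self.decay ** elapsed + 1.0
        self.last_update[cell] = t
        return cell

    def density_at(self, cell, t):
        """Density of a cell as seen at time t, without modifying the grid."""
        if cell not in self.last_update:
            return 0.0
        return self.density[cell] * self.decay ** (t - self.last_update[cell])


grid = DecayingGrid(cell_width=1.0, decay=0.5)
grid.insert((0.2, 0.7), t=0)
grid.insert((0.9, 0.1), t=1)          # same cell, one time step later
print(grid.density_at((0, 0), t=1))   # 1 * 0.5 + 1 = 1.5
```

Updating a cell only when a point lands in it keeps the per-point cost constant, which suits an online phase; an offline phase could then read `density_at` for every cell at the current time.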
Key words:
data stream,
clustering,
fractal dimension,
attenuation coefficient,
grid,
grid density
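Abstract (translated from the Chinese): A data stream clustering algorithm based on density and fractal dimension is proposed. It adopts a two-phase online/offline framework and combines the advantages of density clustering and fractal clustering to overcome the shortcomings of traditional data stream clustering algorithms. To account for the timeliness of the data stream, a decay strategy is applied to data points when grid density is computed. Experimental results show that the algorithm effectively improves the efficiency and accuracy of data stream clustering, and can discover clusters of arbitrary shape as well as clusters that are not adjacent in distance.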
JIN Jian-Ye, NI Zhi-Wei, WANG Sha. Data Stream Clustering Algorithm Based on Density and Fractal Dimension[J]. Computer Engineering, 2012, 38(5): 38-40.
金建业, 倪志伟, 汪莎. 基于密度与分形维数的数据流聚类算法[J]. 计算机工程, 2012, 38(5): 38-40.