
Computer Engineering ›› 2012, Vol. 38 ›› Issue (5): 38-40. doi: 10.3969/j.issn.1000-3428.2012.05.009

• Software Technology and Database •

Data Stream Clustering Algorithm Based on Density and Fractal Dimension

JIN Jian-ye a,b, NI Zhi-wei a,b, WANG Sha a,b

  1. (a. School of Management; b. Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China)
  • Received: 2011-08-30 Online: 2012-03-05 Published: 2012-03-05
  • About the authors: JIN Jian-ye (born 1987), male, master's student, main research interest: data stream clustering; NI Zhi-wei, professor and doctoral supervisor; WANG Sha, master's student
  • Supported by:
    National Natural Science Foundation of China (70871033, 70801025); National "863" Program of China (2007AA04Z116)

Abstract: To address the deficiencies of popular data stream clustering algorithms, this paper presents a data stream clustering algorithm based on density and fractal dimension. The algorithm uses a two-phase online/offline framework and combines the advantages of density-based clustering and fractal clustering. To reflect the timeliness of the data stream, it applies a decay strategy to data points when computing grid density. Experimental results show that the algorithm improves the efficiency and accuracy of data stream clustering, and that it can find clusters of arbitrary shape as well as clusters whose points are not adjacent in distance.

Key words: data stream, clustering, fractal dimension, attenuation coefficient, grid, grid density
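
The abstract states that a decay strategy is applied to data points when grid density is computed, but it does not give the decay function. The following is a minimal sketch of one common way to do this, assuming exponential decay with a per-time-step factor and non-decreasing timestamps. The class name `DecayedGrid` and the parameters `cell_width` and `decay` are hypothetical and are not taken from the paper.

```python
import math


class DecayedGrid:
    """Grid cells whose density fades exponentially over time (illustrative only)."""

    def __init__(self, cell_width: float, decay: float):
        # Assumption: decay is the factor applied per unit of time, 0 < decay < 1.
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        if cell_width <= 0.0:
            raise ValueError("cell_width must be positive")
        self.cell_width = cell_width
        self.decay = decay
        self.density = {}      # cell index -> density at its last update
        self.last_update = {}  # cell index -> timestamp of its last update

    def cell_of(self, point):
        """Map a point to the integer index of the grid cell that contains it."""
        return tuple(math.floor(x / self.cell_width) for x in point)

    def density_at(self, cell, t):
        """Return the density of a cell at time t, decayed since its last update."""
        if cell not in self.density:
            return 0.0
        elapsed = t - self.last_update[cell]
        return self.density[cell] * self.decay ** elapsed

    def insert(self, point, t):
        """Add one point arriving at time t and return the cell it fell into."""
        cell = self.cell_of(point)
        # Decay the stored density up to time t, then count the new point as 1.
        self.density[cell] = self.density_at(cell, t) + 1.0
        self.last_update[cell] = t
        return cell


grid = DecayedGrid(cell_width=1.0, decay=0.5)
cell = grid.insert((0.2, 0.7), t=0)
grid.insert((0.9, 0.1), t=1)
print(cell, grid.density_at(cell, t=3))
```

Storing only the last density and its timestamp per cell means old points never need to be revisited, which suits the online phase of a two-phase framework. The offline phase would then group cells by their decayed densities.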
