Computer Engineering ›› 2010, Vol. 36 ›› Issue (19): 56-58. doi: 10.3969/j.issn.1000-3428.2010.19.019

• Software Technology and Database •

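  • About the authors: FENG Shao-rong (b. 1964), male, associate professor, Ph.D.; main research interests: parallel and distributed databases, data warehousing, and data mining. ZHANG Dong-zhan, associate professor, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (50604012)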

Distributed Clustering Algorithm Based on Centers and Density

FENG Shao-rong, ZHANG Dong-zhan   

  1. (School of Information Science and Technology, Xiamen University, Xiamen 361005, China)
  • Online:2010-10-05 Published:2010-09-27

Abstract: To address the shortcomings of the distributed clustering algorithm DBDC, this paper proposes DCUCD, a distributed clustering algorithm based on centers and density. Virtual points computed from the distributed data serve as core objects, and these core objects become more representative the more times the algorithm is run. Clustering then amounts to classifying all of the core objects. Theoretical analysis and experimental results show that DCUCD effectively handles local noise and discovers clusters of arbitrary shape, with good time efficiency and clustering quality.

Key words: data mining, distributed clustering, centers, noise

CLC number: