
Computer Engineering ›› 2007, Vol. 33 ›› Issue (13): 200-201. doi: 10.3969/j.issn.1000-3428.2007.13.068

• Artificial Intelligence and Recognition Technology •

Research on Modified k-means Data Cluster Algorithm

SUN Shibao1,2, QIN Keyun1

  1. Intelligent Control Development Center, Southwest Jiaotong University, Chengdu 610031
  2. Electronic Information Engineering College, Henan University of Science and Technology, Luoyang 471003
  • Online: 2007-07-05 Published: 2007-07-05

Abstract: The quality of a clustering algorithm directly affects the clustering result. This paper discusses the classical k-means clustering algorithm and shows its shortcomings: it cannot handle symbolic data well, and it is sensitive to noise and isolated points (outliers). A modified k-means clustering algorithm based on weights is proposed to overcome these shortcomings, and its complexity is analyzed theoretically. Experiments show that, compared with the traditional mean-based method, the proposed method effectively improves the clustering result.

Key words: clustering algorithm, k-means, weights, clustering data mining
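
The abstract does not say how the weights are defined, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes the caller supplies one non-negative weight per point (the function name `weighted_kmeans` and its parameters are hypothetical) and replaces the plain mean in the k-means update step with a weighted mean, so that low-weight points such as suspected noise pull a centroid less.

```python
import math


def weighted_kmeans(points, weights, centroids, max_iter=100):
    """Lloyd-style k-means whose update step uses a weighted mean.

    points    -- list of equal-length numeric tuples
    weights   -- one non-negative weight per point (assumed input; the
                 paper's actual weighting scheme is not given in the abstract)
    centroids -- initial centroids, one tuple per cluster
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: weighted mean of the members of each cluster.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == j]
            total = sum(w for _, w in members)
            if total == 0:
                # Empty or zero-weight cluster: leave its centroid in place.
                new_centroids.append(old)
                continue
            new_centroids.append(
                tuple(sum(w * p[d] for p, w in members) / total
                      for d in range(len(old)))
            )
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels
```

As a usage illustration, take the one-dimensional points 0, 1, 10, 11 and 100 with initial centroids 0 and 10. With uniform weights, the isolated point 100 captures a centroid for itself and the two natural groups are merged around 5.5. With its weight set to 0, the centroids settle at 0.5 and 10.5. How such weights should be chosen in practice is exactly what the abstract leaves unspecified.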
