
Computer Engineering (计算机工程), 2010, Vol. 36, Issue 9: 65-67. doi: 10.3969/j.issn.1000-3428.2010.09.022

• Software Technology and Database •

Density Clustering Algorithm Based on Subspace Dimensional Weighting

HUANG Wang-fei1, CHEN Li-fei2, JIANG Qing-shan1,3

  (1. School of Software, Xiamen University, Xiamen 361005; 2. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 360108; 3. Chengdu University, Chengdu 610106)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-05-05 Published: 2010-05-05

Abstract: When clustering high-dimensional data, most existing algorithms perform poorly because of the curse of dimensionality. This paper presents StaDeCon, a density clustering algorithm for high-dimensional data. Building on the classic PreDeCon algorithm, StaDeCon introduces a method for computing subspace dimension weights, which avoids the problems caused by PreDeCon's use of a full-dimensional distance measure and improves clustering quality. Experimental results on both synthetic and real-world data sets show that the algorithm achieves good clustering accuracy on high-dimensional data and is effective and feasible.

Key words: clustering, high-dimensional data, subspace, dimensional weighting
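
The abstract does not give StaDeCon's weighting formula, so the sketch below illustrates only the PreDeCon-style baseline it builds on, as that algorithm is commonly described: each point's neighbourhood is found with the full-dimensional Euclidean distance, attributes with low variance in that neighbourhood receive a large weight, and distances are then measured with those weights. The function names and the parameters `eps`, `delta` and `kappa` are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def preference_weights(X, eps, delta, kappa):
    """PreDeCon-style per-point attribute weights (illustrative sketch).

    X     : (n, d) data matrix
    eps   : neighbourhood radius, measured in the full-dimensional space
    delta : variance threshold below which an attribute counts as "preferred"
    kappa : weight given to preferred attributes (kappa much larger than 1)
    """
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]        # pairwise differences, shape (n, n, d)
    dist = np.sqrt((diff ** 2).sum(axis=2))     # full-dimensional Euclidean distances
    weights = np.ones((n, d))
    for p in range(n):
        neigh = dist[p] <= eps                  # eps-neighbourhood of p (includes p itself)
        # Variance of the neighbours along each attribute, measured around p
        var = (diff[p, neigh] ** 2).mean(axis=0)
        weights[p, var <= delta] = kappa
    return weights

def weighted_distance(X, weights, p, q):
    """Symmetric weighted distance: the larger of the two one-sided distances."""
    sq = (X[p] - X[q]) ** 2
    d_pq = np.sqrt((weights[p] * sq).sum())     # distance using p's weights
    d_qp = np.sqrt((weights[q] * sq).sum())     # distance using q's weights
    return max(d_pq, d_qp)

# Usage: attribute 0 is constant, attribute 1 is spread out
X = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]])
W = preference_weights(X, eps=5.0, delta=0.1, kappa=100.0)
d01 = weighted_distance(X, W, 0, 1)
```

Because the neighbourhood in `preference_weights` is built from distances over all attributes, it can become uninformative as dimensionality grows; this is the kind of full-space dependence the abstract says StaDeCon is designed to avoid.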
