
Computer Engineering (计算机工程), 2008, Vol. 34, Issue (11): 72-73,7. doi: 10.3969/j.issn.1000-3428.2008.11.026

• Software Technology and Database •

Clustering Algorithm of Outliers Based on Adjacency Graph

JIN Yi-fu (金义富)1,2, ZHU Qing-sheng (朱庆生)2, ZOU Xian-lin (邹咸林)2

  1. School of Information, Zhanjiang Normal College, Zhanjiang 524048; 2. College of Computer, Chongqing University, Chongqing 400044
  • Received:1900-01-01 Revised:1900-01-01 Online:2008-06-05 Published:2008-06-05

Abstract: Outliers are small patterns in data. Because they are inherently few and sparse, conventional clustering approaches, such as distance-based and statistics-based methods, are not suited to classifying outliers. Based on the degree of overlap between the key attribute subspaces of outlying objects, this paper defines the concepts of outlying shared attribute set and outlying similarity, and proposes an analysis technique for -clusters of outliers. An outlying adjacency graph is constructed and then sparsified, so that the search for -clusters of outliers corresponds one-to-one with the search for maximum complete subgraphs of that graph. This yields an algorithm for clustering outliers based on the adjacency graph. Worked examples and experimental results show that the method is efficient and intuitive.

Key words: outliers, key attribute subspace, outlying adjacency graph, clustering algorithm
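
The abstract outlines a pipeline: compare key attribute subspaces, link similar outliers in an adjacency graph, and read clusters off as complete subgraphs. The following is a minimal Python sketch of that idea under stated assumptions, not the authors' algorithm. The abstract does not give the formula for outlying similarity, so the Jaccard index is assumed here; the `threshold` parameter, the function names, and the sample data are all hypothetical; and complete subgraphs are enumerated with the standard Bron-Kerbosch procedure, which is a swapped-in technique.

```python
from itertools import combinations


def outlying_similarity(subspace_a, subspace_b):
    """Assumed overlap measure: Jaccard index of two key attribute subspaces."""
    shared = subspace_a & subspace_b  # stands in for the outlying shared attribute set
    union = subspace_a | subspace_b
    return len(shared) / len(union) if union else 0.0


def build_adjacency_graph(subspaces, threshold):
    """Link two outliers when their similarity reaches the (hypothetical) threshold.

    Leaving out the weaker edges plays the role of the sparsification step.
    """
    graph = {name: set() for name in subspaces}
    for a, b in combinations(subspaces, 2):
        if outlying_similarity(subspaces[a], subspaces[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal complete subgraphs with basic Bron-Kerbosch."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            # No vertex can extend the current set, so it is maximal.
            cliques.append(frozenset(current))
            return
        for vertex in list(candidates):
            expand(current | {vertex},
                   candidates & graph[vertex],
                   excluded & graph[vertex])
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical outliers, each mapped to its key attribute subspace.
subspaces = {
    "o1": {"age", "income"},
    "o2": {"age", "income", "debt"},
    "o3": {"income", "debt"},
    "o4": {"height"},
}
graph = build_adjacency_graph(subspaces, threshold=0.5)
clusters = maximal_cliques(graph)
```

With these hypothetical inputs, the sketch groups o1 with o2 and o2 with o3, and leaves o4 on its own. An outlier may therefore belong to more than one clique, a property of clique enumeration in general rather than a claim about the paper's results.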
