Computer Engineering, 2010, Vol. 36, Issue 9: 62-64. doi: 10.3969/j.issn.1000-3428.2010.09.021

• Software Technology and Database •

Conceptual Clustering Based on Data Streams

SHI Jin-cheng¹, HU Xue-gang²

  1. Department of Mathematics and Computer Science, Tongling College, Tongling 244000
  2. School of Computer & Information, Hefei University of Technology, Hefei 230009
  • Online: 2010-05-05 Published: 2010-05-05

Abstract: The relationship between the bicliques of a bipartite graph and the conceptual clustering problem is analysed. On this basis, and taking the characteristics of data streams into account, a conceptual clustering algorithm is proposed for data streams whose objects have Boolean attributes. The data stream is partitioned into segments; as each segment arrives, a local set of approximate maximum ε-bicliques is generated and used to update the global set of approximate maximum ε-bicliques, so that the entire stream is clustered. Experimental results show that the algorithm is efficient in both time and space.

Key words: data streams, conceptual clustering, approximate maximum ε-bicliques
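
The abstract describes a segment-then-merge scheme but gives no details. The Python sketch below only illustrates that general shape and is not the authors' algorithm. It rests on several assumptions: an ε-biclique is taken to be a pair (objects, attributes) in which every object has at least (1 − ε) of the attributes; local candidates come from a simple seed heuristic (each distinct attribute set in a segment is a seed); and the global set keeps only a support count per attribute set. All names (`segments`, `local_bicliques`, `cluster_stream`, `min_objects`) are hypothetical.

```python
from itertools import islice
from typing import Dict, FrozenSet, Iterable, Iterator, List, Set, Tuple

# One stream record: (object id, set of Boolean attributes that are true).
Record = Tuple[str, FrozenSet[str]]


def segments(stream: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Cut the stream into consecutive batches of at most `size` records."""
    it = iter(stream)
    while batch := list(islice(it, size)):
        yield batch


def local_bicliques(
    batch: List[Record], eps: float, min_objects: int
) -> Dict[FrozenSet[str], Set[str]]:
    """Assumed seed heuristic (not from the paper): every distinct attribute
    set B in the batch is a seed, and an object joins the candidate if it has
    at least (1 - eps) * |B| of the attributes in B."""
    result: Dict[FrozenSet[str], Set[str]] = {}
    for seed in {attrs for _, attrs in batch if attrs}:
        members = {
            oid for oid, attrs in batch
            if len(attrs & seed) >= (1 - eps) * len(seed)
        }
        if len(members) >= min_objects:
            result[seed] = members
    return result


def cluster_stream(
    stream: Iterable[Record], size: int = 3, eps: float = 0.25, min_objects: int = 2
) -> Dict[FrozenSet[str], int]:
    """Merge each segment's local candidates into a global summary that stores
    one support count per attribute set instead of the object ids."""
    global_set: Dict[FrozenSet[str], int] = {}
    for batch in segments(stream, size):
        for attrs, members in local_bicliques(batch, eps, min_objects).items():
            global_set[attrs] = global_set.get(attrs, 0) + len(members)
    return global_set


if __name__ == "__main__":
    ab, c = frozenset({"a", "b"}), frozenset({"c"})
    demo = [("o1", ab), ("o2", ab), ("o3", c), ("o4", ab), ("o5", c), ("o6", c)]
    print(cluster_stream(demo, size=3, eps=0.0, min_objects=2))
```

In the demo, `o3` and `o4` each arrive in a segment with no similar objects and fall below `min_objects`, so each attribute set ends up with a support of 2 rather than 3. This is the kind of approximation a per-segment summary accepts in exchange for bounded memory.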
