
Computer Engineering (计算机工程) ›› 2007, Vol. 33 ›› Issue (02): 12-14. doi: 10.3969/j.issn.1000-3428.2007.02.005

• 博士论文 (Doctoral Thesis) •

兴趣子空间挖掘算法在高维数据聚类中的应用

杨 颖1,2,韩忠明1,杨 磊3   

  1. 东华大学计算机科学与技术学院,上海 200051;2. 广西大学计算机与信息工程学院,南宁 530004;3. 广西计算中心,南宁 530022
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2007-01-20 发布日期:2007-01-20

Application of Interesting Subspace Mining Algorithm in High-dimensional Data Clustering

YANG Ying1,2, HAN Zhongming1, YANG Lei3   

  1. College of Computer Science and Technology, Donghua University, Shanghai 200051; 2. College of Computer & Information Engineering, Guangxi University, Nanning 530004; 3. Guangxi Computer Center, Nanning 530022
  • Received:1900-01-01 Revised:1900-01-01 Online:2007-01-20 Published:2007-01-20

摘要: 给出了兴趣子空间的定义,采用基于Chernoff-Hoeffding边界,带回溯的深度优先搜索算法来挖掘最大兴趣子空间,并运用高维真实数据和合成数据检验算法的有效性。高维数据的挖掘面临着数据分布的稀疏性和特征空间的相交性所带来的挑战。

关键词: 兴趣子空间, 高维数据, 聚类, 数据挖掘

Abstract: This paper defines interesting subspaces and, based on the Chernoff-Hoeffding bound, adopts a novel depth-first search algorithm with backtracking to mine maximal interesting subspaces. The effectiveness of the algorithm is verified on high-dimensional real and synthetic data. High-dimensional data mining faces the challenges posed by the sparsity of the data distribution and the overlapping of feature subspaces.

Key words: Interesting subspace, High-dimensional data, Clustering, Data mining
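
The abstract names two ingredients: a Chernoff-Hoeffding bound and a depth-first search with backtracking over subspaces. The abstract does not give the paper's definition of an interesting subspace, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes each dimension has already been discretized into `xi` equal-width bins, and it calls a subspace "interesting" when some grid cell holds a larger fraction of the points than a uniform distribution would explain at significance level `tau`. The names `hoeffding_threshold`, `is_interesting`, `maximal_interesting_subspaces`, `xi`, and `tau` are all hypothetical.

```python
import math
from collections import Counter


def hoeffding_threshold(n, p, xi, tau):
    """Smallest cell fraction treated as non-random (assumed criterion).

    Under a uniform distribution a p-dimensional cell is expected to hold a
    fraction xi**-p of the n points. Hoeffding's inequality bounds the chance
    of exceeding that by eps with exp(-2 * n * eps**2); solving for
    probability tau gives eps = sqrt(ln(1 / tau) / (2 * n)).
    """
    return xi ** (-p) + math.sqrt(math.log(1.0 / tau) / (2.0 * n))


def is_interesting(cells, dims, xi, tau):
    """True if the densest grid cell of the subspace `dims` passes the bound."""
    n = len(cells)
    counts = Counter(tuple(row[d] for d in dims) for row in cells)
    return max(counts.values()) / n >= hoeffding_threshold(n, len(dims), xi, tau)


def maximal_interesting_subspaces(cells, xi, tau):
    """Depth-first search with backtracking over subspaces.

    `cells` is a list of equal-length tuples of bin indices. A branch is
    abandoned as soon as an extension fails the test; this pruning is a
    heuristic assumption, since the threshold also shrinks with dimension.
    """
    n_dims = len(cells[0])
    found = []

    def dfs(current, start):
        extended = False
        for d in range(start, n_dims):
            candidate = current + (d,)
            if is_interesting(cells, candidate, xi, tau):
                extended = True
                dfs(candidate, d + 1)
        # Backtracking point: `current` could not be extended any further.
        # Keep it only if no subspace found earlier already contains it.
        if current and not extended:
            if not any(set(current) <= set(m) for m in found):
                found.append(current)

    dfs((), 0)
    return found
```

Because dimensions are added in increasing index order, any reachable superset of a subspace that is not a plain extension of it is visited earlier in the search, so comparing against the subspaces already in `found` is enough to keep the reported ones maximal among those the search reaches.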