
Computer Engineering ›› 2012, Vol. 38 ›› Issue (16): 178-181. doi: 10.3969/j.issn.1000-3428.2012.16.046

• Artificial Intelligence and Recognition Technology •

Multi-cluster Feature Selection Based on Grassmann Manifold

LIN Guang-feng a, ZHU Hong b, FAN Cai-xia a, ZHANG Er-hu a, LUO Lei a

  1. (a. Faculty of Printing and Packaging Engineering; b. Faculty of Automation and Information Engineering, Xi’an University of Technology, Xi’an 710048, China)
  • Received: 2011-10-12 Revised: 2011-12-05 Online: 2012-08-20 Published: 2012-08-17
  • About the authors: LIN Guang-feng (b. 1978), male, lecturer and Ph.D. candidate, research interests: digital image processing, pattern recognition; ZHU Hong, professor and doctoral supervisor; FAN Cai-xia, lecturer and Ph.D. candidate; ZHANG Er-hu, professor and doctoral supervisor; LUO Lei, engineer
  • Supported by:
    National Natural Science Foundation of China (61073092); National International Science and Technology Cooperation Special Fund (2011DFR10480); Natural Science Special Fund of the Shaanxi Provincial Department of Education (2010JK718)

Abstract: In unsupervised feature selection for clustering, the local topology used by spectral clustering is usually built from Euclidean distance, which can scramble the topology of the local manifold within small neighborhoods and degrade the clustering performance of the selected features. To address this problem, a Multi-cluster Feature Selection (MCFS) algorithm based on the Grassmann manifold is proposed. The tangent space at each data point is approximated by local principal component analysis, which captures the main directions of variation of the local data and filters out the influence of the scrambling points introduced by Euclidean distance. A Grassmann manifold is constructed from these tangent spaces, and its geodesic distance preserves the manifold topology of the local data. This topology is then approximated by L1-norm optimization to select a subset of the original features that favors clustering. Experimental results verify the effectiveness of the algorithm.

Key words: unsupervised clustering, feature selection, Grassmann manifold, tangent space, subspace, regularization
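
The first two steps named in the abstract (local PCA to approximate tangent spaces, then a geodesic distance between those subspaces) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `local_tangent_basis` and `grassmann_geodesic`, the neighborhood size `k`, and the subspace dimension `d` are all assumptions made for illustration, and the geodesic distance used is the standard arc-length form built from principal angles, which the abstract does not spell out. The L1-norm feature selection step is not covered here.

```python
import numpy as np


def local_tangent_basis(X, i, k, d):
    """Orthonormal basis (n_features x d) of the local PCA subspace at sample i.

    k (neighborhood size) and d (subspace dimension) are illustrative
    parameters; the abstract does not state how they are chosen.
    """
    # k nearest samples to X[i] by Euclidean distance (X[i] itself included).
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dists)[:k]
    # Centre the patch; its right singular vectors are the local
    # principal directions, ordered by decreasing variance.
    patch = X[nbrs] - X[nbrs].mean(axis=0)
    _, _, vt = np.linalg.svd(patch, full_matrices=False)
    return vt[:d].T


def grassmann_geodesic(A, B):
    """Arc-length geodesic distance between the subspaces spanned by A and B.

    A and B must have orthonormal columns and the same shape.
    """
    # Singular values of A^T B are the cosines of the principal angles.
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    return float(np.linalg.norm(angles))


# Toy data: 360 points on a unit circle embedded in 3-D (a 1-D manifold).
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

T0 = local_tangent_basis(X, 0, k=5, d=1)      # tangent line at angle 0
T90 = local_tangent_basis(X, 90, k=5, d=1)    # tangent line at angle pi/2
T180 = local_tangent_basis(X, 180, k=5, d=1)  # tangent line at angle pi

d_quarter = grassmann_geodesic(T0, T90)
d_opposite = grassmann_geodesic(T0, T180)
print("tangent distance, quarter turn apart:", d_quarter)
print("tangent distance, opposite points:   ", d_opposite)
```

In this toy example the two opposite points are as far apart as possible in Euclidean terms, yet their tangent lines are parallel, so the subspace distance between them is close to zero. The two metrics can therefore induce quite different neighborhood structures, which is the kind of discrepancy the abstract attributes to the Euclidean metric.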
