Computer Engineering

• Advanced Computing and Data Processing

Robust Low-rank Self-representation Feature Selection Algorithm

HU Rongyao 1, LIU Xingyi 2, CHENG Debo 1, HE Wei 1, LUO Yan 1

  1. (1. Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004, China; 2. Qinzhou University, Qinzhou, Guangxi 535000, China)
  • Received: 2016-06-28  Online: 2017-09-15  Published: 2017-09-15
  • About the authors: HU Rongyao (b. 1992), male, M.S. candidate; research interests include data mining and machine learning. LIU Xingyi, associate professor, M.S.; CHENG Debo, HE Wei and LUO Yan, M.S. candidates.
  • Funding:
    National Natural Science Foundation of China (61263035, 61573270); China Postdoctoral Science Foundation (2015M570837); Natural Science Foundation of Guangxi (2015GXNSFCB139011); Innovation Project of Guangxi Graduate Education (YCSZ2016046).



Abstract: Since unsupervised feature selection algorithms lack label information and ignore the low-rank characteristics of the data, this paper proposes a new low-rank feature selection algorithm based on the self-representation method. In the loss function, low-rank and self-representation terms describe the correlation structure among features, and K-means clustering is used to obtain pseudo labels for all samples to guide feature selection. The parameter p of the l2,p-norm from sparse learning is adopted to control the sparsity of the feature selection result, and a subspace learning method drives the result toward the global optimum. Experimental results on six public datasets demonstrate that the proposed algorithm achieves higher classification accuracy and better stability than existing unsupervised feature selection algorithms.
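As a rough illustration of the role the l2,p-norm plays in the abstract above, the sketch below (a minimal example of the general technique, not the authors' implementation; the function names and the toy weight matrix are hypothetical) ranks features by the l2,p row norm of a feature-weight matrix and keeps the top k. Features whose weight rows are near zero across all pseudo-classes score low and are discarded, which is the sparsity effect the parameter p controls.

```python
import numpy as np

def l2p_row_scores(W, p):
    """Row-wise l2 norms raised to the power p; summing them gives the
    l2,p-norm of W raised to p. Monotone in the row norm for any p > 0."""
    return np.linalg.norm(W, axis=1) ** p

def select_features(W, k, p=1.0):
    """Rank features (rows of W) by their l2,p score and keep the top k."""
    scores = l2p_row_scores(W, p)
    return np.argsort(scores)[::-1][:k]

# Toy weight matrix: 5 features x 3 pseudo-classes (values made up).
W = np.array([[0.90, 0.80, 0.70],   # strong feature
              [0.00, 0.10, 0.00],   # near-zero row -> pruned
              [0.50, 0.40, 0.60],
              [0.05, 0.00, 0.02],   # near-zero row -> pruned
              [0.30, 0.20, 0.10]])

print(select_features(W, k=2, p=0.5))  # -> [0 2]
```

Since x**p is monotone for p > 0, the ranking itself does not change with p; in the full optimization problem, however, a smaller p penalizes small rows more aggressively and so produces sparser weight matrices.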

Key words: feature selection, subspace learning, K-means clustering, low-rank constraint, sparse learning

CLC Number: