Computer Engineering

• Artificial Intelligence and Recognition Technology •

Feature Selection Based on Sparse Bayesian Model

ZHU Pu a,b,c, HUANG Zhangjin a,b,c

  1. (a. School of Computer Science and Technology; b. Anhui Province Key Laboratory of Computing and Communication Software; c. Institute of Advanced Technology, University of Science and Technology of China, Hefei 230027, China)
  • Received: 2016-03-09 Online: 2017-04-15 Published: 2017-04-14
  • About the authors: ZHU Pu (born 1991), male, master's student, main research interest: machine learning; HUANG Zhangjin (corresponding author), associate professor.
  • Funding:
    Natural Science Foundation of Anhui Province (1408085MKL06); Programme of Introducing Talents of Discipline to Universities (B07033).

Abstract: Using sparse Bayesian inference, this paper designs a Feature Selection Probabilistic Classification Vector Machine (FPCVM) that learns an optimal classifier and selects the most relevant feature subset at the same time. FPCVM extends the Probabilistic Classification Vector Machine (PCVM) with feature selection, which improves the performance of PCVM on high-dimensional datasets. Zero-mean Gaussian distributions are used as priors to introduce sparsity in both the kernel functions and the feature space; these priors act as regularization terms in the model and yield a classifier that generalizes better. Experimental results on high-dimensional and low-dimensional datasets show that the algorithm achieves good classification and feature selection performance.

Key words: machine learning, kernel function, sparse Bayesian, feature selection, Probabilistic Classification Vector Machine (PCVM), automatic relevance determination
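The abstract's remark that a zero-mean Gaussian prior acts as a regularization term can be illustrated with a minimal sketch. This is not FPCVM: it assumes a plain logistic model without kernels, and the per-weight precisions `alpha` are fixed by hand, whereas sparse Bayesian methods estimate them from the data. All function and parameter names here are hypothetical. With independent priors w_i ~ N(0, 1/alpha_i), the negative log posterior is the negative log-likelihood plus 0.5 * sum(alpha_i * w_i^2), so a large precision pushes the corresponding weight toward zero.

```python
import numpy as np


def neg_log_likelihood(w, X, y):
    """Bernoulli negative log-likelihood with a logistic link; y is in {0, 1}."""
    z = X @ w
    # log(1 + exp(z)) computed stably with logaddexp
    return float(np.sum(np.logaddexp(0.0, z) - y * z))


def neg_log_posterior(w, X, y, alpha):
    """NLL plus the penalty from independent priors w_i ~ N(0, 1 / alpha_i).

    Terms that do not depend on w are dropped.
    """
    penalty = 0.5 * float(np.sum(alpha * w ** 2))
    return neg_log_likelihood(w, X, y) + penalty


def map_fit(X, y, alpha, lr=0.1, steps=2000):
    """Gradient descent on the negative log posterior (illustrative only)."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (p - y) + alpha * w      # likelihood term + prior term
        w -= lr * grad / n                    # step scaled by sample count
    return w
```

Fitting the same data with a small and a large precision on one feature shows the shrinkage: the weight under the large precision ends up much closer to zero. In an automatic relevance determination setting, this is the mechanism that prunes features or kernel basis functions once their precisions grow large.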
