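Abstract: In machine learning, redundant information and irrelevant features lead to high computational complexity and overfitting. To address this, a Filter feature selection algorithm based on coalitional games is proposed. Joint mutual information measures the dependence between a coalition and the target class, the Shapley power index evaluates the importance of each feature within the whole feature space, and the feature subset with the highest priority is selected for classification learning. Experimental results show that, with the C4.5 and support vector machine classifiers, the mean classification accuracy of the feature subsets selected by the algorithm is 88.72% and 93.39% respectively, higher than that of the mRMR and ReliefF algorithms.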
Keywords:
machine learning,
curse of dimensionality,
feature selection,
coalitional game,
information theory,
joint mutual information
Abstract: Redundant information and irrelevant features can lead to high computational complexity and overfitting in machine learning. This paper proposes a filter feature selection algorithm based on coalitional games. Joint mutual information is used to measure the relevance between a coalition and the target class, and the Shapley value is used to evaluate the importance of each feature within the feature space; the feature subset with the highest priority is then selected for classification learning. Experimental results show that, under the C4.5 and Support Vector Machine (SVM) classifiers, the mean classification accuracy of the feature subsets selected by this algorithm is 88.72% and 93.39%, respectively, higher than that of the mRMR and ReliefF algorithms.
Key words:
machine learning,
curse of dimensionality,
feature selection,
coalitional game,
information theory,
joint mutual information
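
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes discrete features, takes the value of a coalition S to be the joint mutual information I(X_S; Y) estimated from empirical counts, and computes the exact Shapley value of each feature by enumerating every coalition. The function names `joint_mutual_information` and `shapley_values` are hypothetical, and the exhaustive enumeration is exponential in the number of features, so it only suits small feature sets; the paper may well bound the coalition size or approximate the index in some other way.

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def joint_mutual_information(rows, labels, subset):
    """Empirical I(X_S; Y) in bits for discrete features.

    `subset` is a tuple of column indices; the empty coalition has value 0.
    """
    if not subset:
        return 0.0
    n = len(rows)
    keys = [tuple(row[i] for i in subset) for row in rows]
    count_x = Counter(keys)
    count_y = Counter(labels)
    count_xy = Counter(zip(keys, labels))
    # sum over observed (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def shapley_values(rows, labels):
    """Exact Shapley value of every feature, with v(S) = I(X_S; Y)."""
    d = len(rows[0])
    values = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        phi = 0.0
        for k in range(d):  # k = size of the coalition that feature i joins
            # |S|! (d - |S| - 1)! / d!  ==  1 / (d * C(d - 1, |S|))
            weight = 1.0 / (d * comb(d - 1, k))
            for coalition in combinations(others, k):
                gain = (
                    joint_mutual_information(rows, labels, coalition + (i,))
                    - joint_mutual_information(rows, labels, coalition)
                )
                phi += weight * gain
        values.append(phi)
    return values
```

Features can then be ranked by their Shapley value and the top-ranked ones kept, for example with `sorted(range(len(phi)), key=lambda i: -phi[i])`, where `phi` is the list returned by `shapley_values`. By the efficiency property of the Shapley value, the scores sum to the joint mutual information of the full feature set with the class.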
CLC number:
李智广, 付枫, 孙鑫, 李彩虹. 基于联盟博弈的Filter特征选择算法[J]. 计算机工程, 2013, 39(4): 230-233.
LI Zhi-Guang, FU Feng, SUN Xin, LI Cai-Hong. Filter Feature Selection Algorithm Based on Coalitional Game[J]. Computer Engineering, 2013, 39(4): 230-233.