Computer Engineering ›› 2007, Vol. 33 ›› Issue (04): 184-186. doi: 10.3969/j.issn.1000-3428.2007.04.064

• Artificial Intelligence and Recognition Technology •

A Feature Selection Method Fitting for Large Data Set

ZHANG Li, CHEN Gonghe   

  1. (School of Information Technology & Management Engineering, University of International Business and Economics, Beijing 100029)
  • Received:1900-01-01 Revised:1900-01-01 Online:2007-02-20 Published:2007-02-20

A Feature Selection Method Suited to Large-Scale Data Sets

张 莉,陈恭和   

  1. (对外经济贸易大学信息技术与管理工程学院,北京 100029)

Abstract: This paper studies the problem of selecting important features and proposes a feature selection method suited to large data sets. The method selects feature subsets using feature similarity and the idea of the floating search method, selects classifiers by weighting mutual information and classification accuracy, and introduces a Bagging-based selective ensemble algorithm to improve the stability of the selection. Intrusion detection data from KDD Cup’99 is used to validate the performance of the algorithm.

Key words: Feature selection, Feature similarity, Floating search, Selective ensemble

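Abstract: This paper studies the problem of selecting important features from training samples and proposes a feature selection method suited to large-scale data sets. Features are selected in different sample spaces using feature similarity and the idea of the floating search method; classifiers are selected by weighting mutual information and classification accuracy; and a Bagging-based selective ensemble algorithm is proposed to improve the stability of the feature selection algorithm. The performance of the algorithm is validated on the intrusion detection data from KDD Cup’99.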

Key words: Feature selection, Feature similarity, Floating search, Selective ensemble
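
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of stabilizing feature selection with Bagging, not the authors' method. It assumes discrete features, replaces the feature-similarity and floating-search step with a plain top-k ranking by mutual information, and replaces the mutual-information and accuracy weighting with a simple majority vote across bootstrap samples. All function and parameter names (`mutual_information`, `select_top_k`, `bagged_selection`, `min_share`) are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_top_k(rows, labels, k):
    """Assumed stand-in selector: keep the k features with highest MI."""
    n_features = len(rows[0])
    scores = [
        (mutual_information([r[j] for r in rows], labels), j)
        for j in range(n_features)
    ]
    # Highest score first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return {j for _, j in scores[:k]}


def bagged_selection(rows, labels, k, n_bags=25, min_share=0.5, seed=0):
    """Run the selector on bootstrap samples and keep features chosen
    in at least `min_share` of them (unweighted majority vote)."""
    rng = random.Random(seed)
    n = len(rows)
    votes = Counter()
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bag_rows = [rows[i] for i in idx]
        bag_labels = [labels[i] for i in idx]
        votes.update(select_top_k(bag_rows, bag_labels, k))
    return sorted(j for j, v in votes.items() if v / n_bags >= min_share)


if __name__ == "__main__":
    # Toy data: feature 0 copies the label, feature 1 is constant,
    # feature 2 is unrelated to the label.
    labels = [i % 2 for i in range(40)]
    rows = [[labels[i], 0, (i // 2) % 2] for i in range(40)]
    print(bagged_selection(rows, labels, k=1))
```

Selecting on each bootstrap sample separately and then voting keeps a feature only if it is chosen consistently, which is the stability argument the abstract alludes to. A closer rendering of the paper would weight each sample's vote, for example by classifier accuracy, but those weights are not specified in the abstract.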