Abstract: This paper studies the problem of selecting important features from training samples and proposes a feature selection method suited to large-scale data sets. Features are selected in different sample spaces using feature similarity and the idea of the floating search method, classifiers are selected by weighting mutual information and classification accuracy, and a Bagging-based selective ensemble algorithm is proposed to improve the stability of the feature selection algorithm. The performance of the algorithm is validated on the intrusion detection data from KDD Cup’99.
Key words:
Feature selection,
Feature similarity,
Floating search,
Selective ensemble
ZHANG Li (张莉); CHEN Gonghe (陈恭和). A Feature Selection Method Fitting for Large Data Set[J]. Computer Engineering, 2007, 33(04): 184-186.