Abstract:
This study uses machine learning (ML) techniques to classify Internet traffic based on per-flow features. Many flow features can be used for flow classification, but many of them are irrelevant or redundant, so feature selection plays a vital role in optimizing algorithm performance. This paper applies two filter-based feature selection methods to classification algorithms including C4.5, Bayesnet, NBD, and NBK. Experimental results show that the approach improves computational performance without degrading classification accuracy.
Key words:
feature selection,
IP traffic classification,
Machine Learning (ML)
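The abstract does not name the two filter-based methods used, so the following is only a minimal sketch of the general idea of filter-based feature selection: each feature is scored against the class label independently of any classifier, and only the top-scoring features are kept. Information-gain ranking is used here purely as an illustrative stand-in, not as the paper's method; the feature names (`port_class`, `pkt_size`, `noise`), the helper functions, and the toy flows are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(rows, labels, names, k):
    """Rank features by information gain and keep the top k (a filter method)."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]


# Hypothetical, made-up flows: (port class, packet-size bucket, unrelated bit)
names = ["port_class", "pkt_size", "noise"]
rows = [
    ("web", "large", 0),
    ("web", "large", 1),
    ("web", "large", 0),
    ("web", "small", 1),
    ("mail", "small", 0),
    ("mail", "small", 1),
    ("mail", "small", 0),
    ("mail", "large", 1),
]
labels = ["HTTP"] * 4 + ["SMTP"] * 4

# The irrelevant "noise" feature scores zero and is filtered out.
selected = select_features(rows, labels, names, k=2)
print(selected)
```

Because the scoring never trains a classifier, the reduced feature set can be handed unchanged to any of the classification algorithms, which is where the computational saving described in the abstract would come from.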
HUANG Jun-Yi, WU Jing, ZHANG Hui. Analysis of Feature Selection Effect on IP Traffic Classification Algorithms[J]. Computer Engineering, 2010, 36(16): 68-70.