Abstract: To address the large volume and highly dynamic nature of network traffic data, this paper proposes a traffic classification method based on a data stream mining technique, the Concept-adapting Very Fast Decision Tree (CVFDT). CVFDT is suited to streaming data: it updates its model as the distribution of incoming samples changes, and it can therefore cope with concept drift. The proposed method and the naive Bayes method are tested on network traffic data stream sets described by 12 optimal attribute features. Experimental results show that, compared with the naive Bayes method, CVFDT achieves better classification accuracy and spatial stability.
Key words:
traffic classification,
application identification,
Concept-adapting Very Fast Decision Tree (CVFDT),
data stream mining
ZHU Xin, ZHAO Lei, YANG Ji-Wen. Network Traffic Classification Method Based on Concept-adapting Very Fast Decision Tree[J]. Computer Engineering, 2011, 37(12): 101-103.