Computer Engineering

• Artificial Intelligence and Recognition Technology •

Abnormal Traffic Identification Method Based on Bag of Words Model Clustering

MA Linjin, WAN Liang, MA Shaoju, YANG Ting

  1. (College of Computer Science and Technology, Guizhou University, Guiyang 550025, China)
  • Received: 2016-10-26 Online: 2017-05-15 Published: 2017-05-15
  • About the authors: MA Linjin (born 1991), male, master's student, main research interest: network security monitoring; WAN Liang, professor; MA Shaoju and YANG Ting, master's students.
  • Funding:
    Guizhou Provincial Science and Technology Fund (grant nos. 贵黔合LH字[2014]7634号, 黔科合J字[2012]2328号).

Abstract: Existing abnormal traffic detection methods suffer from low identification accuracy, and fast identification depends on choosing a threshold. To address these problems, an improved network abnormal traffic identification method based on Bag of Words (BoW) model clustering is proposed. K-means clustering is applied to existing abnormal and normal traffic to obtain the key points of network traffic. Network traffic is then mapped to the corresponding key points to build a histogram, and abnormal traffic is detected by semi-supervised learning. Experimental results show that the method identifies abnormal traffic better than identification methods based on Naive Bayes (NB), Support Vector Machine (SVM), and others.

Key words: Bag of Words (BoW) model, machine learning, clustering, data mining, abnormal traffic identification
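
The pipeline summarized in the abstract (K-means key points, mapping traffic onto them, one histogram per flow) can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions that the abstract does not state: each flow is treated as a list of per-packet feature vectors, the two features (packet size, inter-arrival time) and the synthetic data are hypothetical, and a plain nearest-neighbour lookup over a few labeled histograms stands in for the semi-supervised detector, which the abstract does not specify.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; the k centroids play the role of 'key points'."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        new = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids

def flow_histogram(flow, centroids):
    """Normalized count of how often each key point is the nearest one."""
    counts = [0] * len(centroids)
    for vec in flow:
        counts[nearest(vec, centroids)] += 1
    return [c / len(flow) for c in counts]

def classify(hist, labeled):
    """1-nearest-neighbour over labeled histograms (a stand-in detector)."""
    return min(labeled, key=lambda item: sq_dist(hist, item[0]))[1]

# Hypothetical synthetic data: [packet size in bytes, inter-arrival time in ms].
rng = random.Random(1)

def make_flow(center, n=30):
    return [[rng.gauss(center[0], 5.0), rng.gauss(center[1], 0.5)] for _ in range(n)]

normal_flows = [make_flow((500.0, 20.0)) for _ in range(5)]
abnormal_flows = [make_flow((60.0, 1.0)) for _ in range(5)]

# Step 1: cluster all packet vectors to obtain the key points (codebook).
all_vectors = [v for f in normal_flows + abnormal_flows for v in f]
centroids = kmeans(all_vectors, k=2)

# Step 2: represent labeled flows as histograms over the key points.
labeled = [
    (flow_histogram(normal_flows[0], centroids), "normal"),
    (flow_histogram(abnormal_flows[0], centroids), "abnormal"),
]

# Step 3: label a previously unseen flow by its histogram.
unseen = make_flow((60.0, 1.0))
print(classify(flow_histogram(unseen, centroids), labeled))
```

The histogram step is what makes this a bag-of-words representation: a flow of any length is reduced to a fixed-length vector of key-point frequencies, so flows can be compared directly regardless of how many packets they contain.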

CLC number: