Computer Engineering ›› 2020, Vol. 46 ›› Issue (4): 123-128,134. doi: 10.19678/j.issn.1000-3428.0054476

• Cyberspace Security •

Research on Anomaly Intrusion Detection System Based on Feature Grouping Clustering

HE Famei1,2, MA Huizhen2,3, WANG Xuren2,3, FENG Anran2,3

  1. Library, Beijing Institute of Technology, Beijing 100081, China;
  2. Key Laboratory of Network Assessment Technology, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;
  3. Information Engineering College, Capital Normal University, Beijing 100048, China
  • Received: 2019-04-03 Revised: 2019-05-11 Online: 2020-04-15 Published: 2019-05-23
  • About the authors: HE Famei (b. 1972), male, Ph.D., main research interests: data mining and intelligence analysis; MA Huizhen, master's student; WANG Xuren, associate professor, Ph.D.; FENG Anran, master's student.
  • Funding:
    National Natural Science Foundation of China (61872252).

Abstract: Network connection data can be grouped by feature type into basic connection features, content features, network traffic features, and host traffic features. Exploiting this property, this paper proposes a K-means-based method that clusters each feature group separately, achieving efficient feature reduction and data dimensionality reduction. Clustering parameters are adjusted to retain the difference information within each feature group, and the C4.5 decision tree algorithm is then used to classify intrusions on the dimension-reduced data. Experimental results show that the method reduces the number of clustering features of the kddcup99 dataset from 41 to 4. The overall detection rate on network connection data is 99.73% with a false detection rate of 0, and the detection rates for normal network connections and Probe attacks are both 100%.

Key words: intrusion detection, network data, K-means algorithm, decision tree, data dimensionality reduction
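
The grouping-then-clustering step summarized in the abstract might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the column ranges in `FEATURE_GROUPS` follow the commonly documented kddcup99 layout (9 basic, 13 content, 9 network traffic, 10 host traffic features), the helper names are hypothetical, symbolic columns are assumed to be numerically encoded already, and the cluster count `k` stands in for the "clustering parameters" whose actual values the abstract does not give.

```python
import random

# Assumed split of the 41 kddcup99 columns (0-based, end-exclusive) into the
# four groups named in the abstract; the paper's exact split is not given here.
FEATURE_GROUPS = {
    "basic": (0, 9),
    "content": (9, 22),
    "network_traffic": (22, 31),
    "host_traffic": (31, 41),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point (first one wins on ties)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans_labels(points, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points]


def reduce_by_group(rows, groups=FEATURE_GROUPS, k=2):
    """Replace each feature group with a single cluster label per row."""
    per_group = []
    for start, end in groups.values():
        sub_rows = [row[start:end] for row in rows]
        per_group.append(kmeans_labels(sub_rows, k))
    # Transpose: one list of len(groups) labels for every input row.
    return [list(labels) for labels in zip(*per_group)]
```

Each 41-column record comes out as four cluster labels, one per feature group, which matches the 41-to-4 reduction reported above. Those four labels would then be the input to a decision tree classifier such as C4.5, which is not shown here.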

CLC Number: