
Computer Engineering, 2024, Vol. 50, Issue (6): 148-156. doi: 10.19678/j.issn.1000-3428.0068055

• Cyberspace Security •

Improving Network Intrusion Detection Methods in Isolated Forests Based on Split Points

YU Changhong, XU Konghao, ZHANG Ze, GAO Ming

  1. School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310000, Zhejiang, China
  • Received: 2023-07-11 Revised: 2023-11-02 Published: 2023-12-19
  • Corresponding author: YU Changhong, E-mail: yuch@mail.zjgsu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61871468); Key Research and Development Program of Zhejiang Province (2017C01G2050953).

Abstract: As network attacks grow in number and complexity, traditional supervised network intrusion detection algorithms cannot accurately identify network connections that lack class labels or have inconspicuous features, while unsupervised algorithms suffer from low detection efficiency and accuracy. To further improve detection performance, this study combines an autoencoder (AE) with an isolation forest model whose split points are improved. First, L1 regularization is applied to the unsupervised AE to increase its sparsity; by learning the intrinsic structure of the data, the AE adaptively extracts discriminative features of intrusion attacks. Then, an improved isolation forest separates the anomalous points: each split point is chosen by maximizing the quotient of the mean and the standard deviation, which determines the best partitioning hyperplane for building an isolation tree and gives the tree a stronger ability to isolate outliers in the relevant subspaces. An anomaly score, obtained from the average path length of each data point over all isolation trees, is used to identify anomalous traffic. In experiments on the KDDCUP99 and UNSW-NB15 datasets against six traditional unsupervised methods, the proposed method improves accuracy and recall by approximately 20% over the traditional isolation forest and improves the F1 score and the area under the curve (AUC) by approximately 10%. It also substantially lowers the misclassification rate relative to the other unsupervised methods.
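The abstract states that anomalous traffic is identified by an anomaly score derived from a data point's average path length across the isolation trees, but it does not give the scoring formula. As a minimal sketch, assuming the method keeps the normalization used by the standard isolation forest, the score could be computed as shown here. The function names, the subsample size of 256, and the logarithmic approximation of the harmonic number are illustrative assumptions; the split-point rule itself is not reproduced.

```python
import math

# Euler-Mascheroni constant, used to approximate the harmonic number H(i).
EULER_GAMMA = 0.5772156649015329


def average_path_length(n: int) -> float:
    """c(n): average path length of an unsuccessful binary-search-tree
    lookup over n points, the normalizer of the standard isolation forest."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # H(n-1) ~ ln(n-1) + gamma
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(path_lengths, subsample_size: int = 256) -> float:
    """Standard isolation forest score s = 2^(-E[h(x)] / c(n)).

    path_lengths: depth at which one point was isolated in each tree.
    subsample_size: points used to build each tree (256 is an assumption,
    not a value taken from the abstract).
    """
    mean_depth = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_depth / average_path_length(subsample_size))


if __name__ == "__main__":
    # Hypothetical depths: a point isolated early versus one buried deep.
    print(anomaly_score([2, 3, 2, 4]))
    print(anomaly_score([11, 12, 10, 13]))
```

Under this scoring, values close to 1 mark points that are isolated after few splits and are therefore likely anomalies, while a point whose mean depth equals c(n) scores 0.5.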

Key words: network intrusion detection, sparse autoencoder (SAE), isolation forest, unsupervised learning, isolation tree
