
Computer Engineering (计算机工程), 2022, Vol. 48, Issue (4): 133-142. doi: 10.19678/j.issn.1000-3428.0060949

• Cyberspace Security •

Abnormal Network Traffic Detection Method Combining Mahalanobis Distance and Autoencoder

LI Beibei1, PENG Li1, DAI Feifei2

  1. School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China;
  2. Taizhou Product Quality and Safety Testing Research Institute, Taizhou, Zhejiang 318000, China
  • Received: 2021-02-26 Revised: 2021-04-01 Published: 2021-04-23
  • About the authors: LI Beibei (born 1995), male, master's student, whose main research interests are anomalous data detection and network intrusion detection; PENG Li, professor and doctoral supervisor; DAI Feifei, master's student.
  • Funding:
    National Key Research and Development Program of China (2018YFD0400902); National Natural Science Foundation of China (61873112).

Abstract: Network traffic data are large in scale and imbalanced in distribution, which limits the accuracy of traditional abnormal traffic detection methods. To address this problem, a detection method combining Mahalanobis distance and an autoencoder is proposed. The method uses the reciprocal of the Mahalanobis distance together with a discriminant threshold to quickly identify part of the normal data, which reduces the amount of training data. A Mahalanobis distance term is also added to the autoencoder cost function to strengthen the autoencoder's feature extraction ability. On this basis, the autoencoder is combined with a classifier to solve the network parameter initialization problem, and the weights of the terms in the cross-entropy loss function of the autoencoder neural network are adjusted to improve training on imbalanced data sets. Experimental results show that the proposed method achieves anomaly detection accuracies of 97.60% on the CICIDS2017 data set and 99.84% on the NSL-KDD data set, and an F1 score of 0.9413 on CICIDS2017, higher than DNN, LSTM, C-LSTM, and other methods.

Key words: abnormal network traffic detection, neural network, Mahalanobis distance, autoencoder, autoencoder neural network
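
Two of the ideas in the abstract, pre-filtering with the reciprocal Mahalanobis distance and re-weighting the cross-entropy terms, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the rule that a score above the threshold means "normal", the use of a pseudo-inverse covariance, and the binary form of the weighted loss are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def fit_normal_profile(normal_samples):
    """Estimate mean and inverse covariance from known-normal traffic (assumed setup)."""
    mean = normal_samples.mean(axis=0)
    cov = np.cov(normal_samples, rowvar=False)
    # Pseudo-inverse tolerates a singular covariance (e.g. constant features).
    return mean, np.linalg.pinv(cov)


def reciprocal_mahalanobis(samples, mean, inv_cov, eps=1e-12):
    """Return 1 / Mahalanobis distance of each row to the normal profile."""
    diff = samples - mean
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    distance = np.sqrt(np.maximum(squared, 0.0))
    return 1.0 / (distance + eps)


def split_by_threshold(samples, mean, inv_cov, threshold):
    """Assumed rule: rows scoring above the threshold are treated as normal
    and set aside; the remaining rows go on to the neural network."""
    score = reciprocal_mahalanobis(samples, mean, inv_cov)
    is_normal = score > threshold
    return samples[is_normal], samples[~is_normal]


def weighted_cross_entropy(y_true, p_pred, w_pos, w_neg, eps=1e-12):
    """Binary cross-entropy with separate weights on the two terms
    (a hypothetical form of the weight adjustment for imbalanced classes)."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    per_sample = w_pos * y_true * np.log(p) + w_neg * (1.0 - y_true) * np.log(1.0 - p)
    return -per_sample.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(size=(500, 4))          # synthetic stand-in for normal traffic
    mean, inv_cov = fit_normal_profile(normal)
    batch = np.array([[0.1] * 4, [10.0] * 4])   # one near-normal row, one outlier
    easy, remainder = split_by_threshold(batch, mean, inv_cov, threshold=0.5)
    print(len(easy), "row(s) filtered as normal;", len(remainder), "row(s) kept for the network")
```

In this sketch a threshold of 0.5 on the reciprocal corresponds to a Mahalanobis distance of 2. Raising `w_pos` above `w_neg` makes errors on the minority (anomalous) class cost more, which is one common way to compensate for imbalanced training data.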

CLC number: