
Computer Engineering ›› 2023, Vol. 49 ›› Issue (11): 178-186. doi: 10.19678/j.issn.1000-3428.0066558

• Cyberspace Security •

Encrypted Malicious Traffic Identification Based on CNN CBAM-BiGRU Attention

Xin DENG1, Zhaohui LIU1,2,*, Yan OUYANG2, Jianhua CHEN1

  1. School of Computer, University of South China, Hengyang 421001, Hunan, China
  2. College of Innovation and Entrepreneurship, University of South China, Hengyang 421001, Hunan, China
  • Received:2022-12-19 Online:2023-11-15 Published:2023-03-10
  • Contact: Zhaohui LIU
  • About the authors:

    Xin DENG (born 1998), male, master's student; main research interest: network security

    Yan OUYANG, master's degree

    Jianhua CHEN, master's student

  • Funding:
    Open Project Fund of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences (kfkt2019-007); National Key Research and Development Project; Scientific Research Project of the Hunan Provincial Department of Education (20C1632)

Abstract:

Encrypting network traffic helps protect data security and user privacy, but it also hides the characteristics of the data, which makes malicious traffic harder to identify. Traditional machine learning methods rely on expert experience, and existing deep learning methods represent encrypted traffic features inadequately. To address these problems, this paper proposes a CNN CBAM-BiGRU Attention model that automatically extracts spatial and temporal features, without decryption, to identify encrypted malicious traffic. The model has two parts: spatial feature extraction and temporal feature extraction. The spatial part uses one-dimensional convolution kernels of different sizes. To prevent the loss of spatial features, the convolutional layer parameters are modified so that the convolutions take the place of the pooling layer in compressing features and removing redundancy. A Convolutional Block Attention Module (CBAM) then weights the extracted spatial features of different scales so that the model focuses on the most discriminative ones. The temporal part uses a Bidirectional Gated Recurrent Unit (BiGRU) to characterize the temporal dependencies between data packets, and then applies Attention to highlight the important packets in a session. Finally, the two feature vectors are fused, and a Softmax classifier performs binary and multi-class classification. In experiments on public datasets, the model achieves an accuracy of 99.95% in identifying encrypted malicious traffic in the binary classification task and an overall accuracy of 99.39% in the multi-class task. For the Dridex and Zbot categories of encrypted malicious traffic, its F1 scores are significantly higher than those of models such as 1D_CNN and BiGRU.
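Two of the mechanisms named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. It assumes that the modified convolution parameter is the stride (the abstract does not say which parameters are changed), and it replaces the learned Attention scoring with a plain dot product against a hypothetical `query` vector. The kernel weights, payload values, and packet vectors are toy values chosen for illustration.

```python
import math


def conv1d(x, kernel, stride=1):
    """Valid 1D convolution (cross-correlation) of a sequence with a kernel."""
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(x) - k + 1, stride)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(packets, query):
    """Weight each packet vector by softmax(dot(packet, query)) and sum them."""
    scores = [sum(p * q for p, q in zip(packet, query)) for packet in packets]
    weights = softmax(scores)
    dim = len(packets[0])
    pooled = [
        sum(w * packet[d] for w, packet in zip(weights, packets))
        for d in range(dim)
    ]
    return weights, pooled


# Toy byte values standing in for part of one packet (hypothetical data).
payload = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]  # hypothetical weights; a trained model learns these

# Stride 1 keeps the temporal resolution; stride 2 halves the output length,
# which is the compression a pooling layer would otherwise provide.
same_rate = conv1d(payload, kernel, stride=1)
downsampled = conv1d(payload, kernel, stride=2)

# Three toy packet representations (e.g. recurrent hidden states) in a session.
packets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # hypothetical scoring vector; learned in a real model
weights, session_vector = attention_pool(packets, query)

print(len(same_rate), len(downsampled))
print([round(w, 3) for w in weights])
```

In this sketch the strided convolution shortens the sequence through learned weights instead of a fixed max or average, and the attention weights sum to one, so the session vector is a weighted average in which higher-scoring packets contribute more.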

Key words: cyber security, encrypted malicious traffic identification, Convolutional Neural Network (CNN), CBAM mechanism, Gated Recurrent Unit (GRU)