Encrypted Malicious Traffic Identification Based on CNN CBAM-BiGRU Attention

doi:10.19678/j.issn.1000-3428.0066558

Abstract

Encrypting network traffic helps protect data security and user privacy, but it also hides the characteristics of the data and makes malicious traffic harder to identify. Traditional machine learning methods rely on expert experience, and existing deep learning methods represent the features of encrypted traffic poorly. To address these problems, this paper proposes a CNN CBAM-BiGRU Attention model that automatically extracts spatial and temporal features, without decryption, to identify encrypted malicious traffic. The model has two parts: spatial feature extraction and temporal feature extraction. The spatial part uses one-dimensional convolution kernels of different sizes. To prevent the loss of spatial features, the parameters of the convolutional layers are modified so that these layers, rather than pooling layers, compress the features and remove redundancy. A CBAM block then weights the extracted spatial features of different scales so that the model can focus on highly discriminative spatial features. The temporal part uses a bidirectional gated recurrent unit (BiGRU) to characterize the temporal dependencies between data packets, and then applies Attention to highlight the important packets in a session. The two feature vectors are then fused, and a Softmax classifier performs binary and multi-class classification. In experiments on public datasets, the model achieves an accuracy of 99.95% in identifying encrypted malicious traffic in the binary classification task and an overall accuracy of 99.39% in the multi-class task. For the Dridex and Zbot categories of encrypted malicious traffic, its F1 scores are significantly higher than those of models such as 1D_CNN and BiGRU.

Key words: cyber security, encrypted malicious traffic identification, Convolutional Neural Network (CNN), CBAM mechanism, Gated Recurrent Unit (GRU)
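Two of the ideas in the abstract can be illustrated with a toy, standard-library Python sketch: letting a convolution compress the feature map in place of a pooling layer, and using attention weights to emphasize some packets in a session. This is not the authors' implementation. The abstract only says that the convolutional layer parameters are modified, so the use of a stride of 2 is an assumption, as are the dot-product attention score, the fixed `query` vector, and fusion by simple concatenation. All names and values below are hypothetical, and nothing here is learned from data.

```python
import math


def conv1d(signal, kernel, stride=1):
    """Unpadded 1D cross-correlation of a list with a kernel, using the given stride."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(packet_vectors, query):
    """Sum packet vectors weighted by softmax of their dot product with `query`."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in packet_vectors]
    weights = softmax(scores)
    dim = len(packet_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, packet_vectors))
        for d in range(dim)
    ]
    return pooled, weights


# Spatial side (assumed stride-2 convolution): the output is half as long as
# the input, so no separate pooling layer is needed to shrink the feature map.
# With this fixed kernel the result equals average pooling; a trained kernel
# could weight the two positions differently.
payload = [0, 1, 2, 3, 4, 5, 6, 7]  # hypothetical byte-level features
downsampled = conv1d(payload, [0.5, 0.5], stride=2)

# Temporal side: three hypothetical per-packet vectors, standing in for
# BiGRU outputs, scored against a hypothetical query vector.
packets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = attention_pool(packets, query)

# Fusion (assumed here to be concatenation) before a classifier.
fused = downsampled + pooled

print(downsampled)
print(weights)
print(fused)
```

In this toy run the third packet has the highest score against `query`, so it receives the largest attention weight and contributes most to the pooled session vector.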

Xin DENG, Zhaohui LIU, Yan OUYANG, Jianhua CHEN. Encrypted Malicious Traffic Identification Based on CNN CBAM-BiGRU Attention[J]. Computer Engineering, 2023, 49(11): 178-186.

http://www.ecice06.com/CN/Y2023/V49/I11/178

Figures and Tables (15)

Fig.1 Structure of the encrypted malicious traffic detection model

Fig.2 Structure of the TCP header

Fig.3 CBAM structure

Fig.4 Channel attention structure

Fig.5 Spatial attention structure

Fig.6 Structure of the temporal feature extraction module

Fig.7 Grayscale images of some traffic samples after preprocessing

Fig.8 Relationship between accuracy and number of iterations

Fig.9 F1 scores for different values of parameter w

Fig.10 Comparison of experimental results of five classification models

Fig.11 Confusion matrix of test set results

