Fast Identification Method of Malicious TLS Traffic Based on CNN-SIndRNN

Computer Engineering, 2022, Vol. 48, Issue (4): 148-157, 164. doi: 10.19678/j.issn.1000-3428.0061003

LI Xiaojian¹, XIE Xiaoyao¹,², XU Yang², ZHANG Sicong²

1. School of Mathematical Science, Guizhou Normal University, Guiyang 550001, China;
2. Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang 550001, China

Received: 2020-03-04 Revised: 2020-04-15 Published: 2021-04-15
About the authors: LI Xiaojian (born 1981), male, Ph.D. candidate, research interests include cyberspace security and deep learning; XIE Xiaoyao, professor, Ph.D., doctoral supervisor; XU Yang, professor, Ph.D.; ZHANG Sicong, Ph.D.
Funding:
Central Government Guided Local Science and Technology Development Special Fund (黔科中引地［2018］4008); Guizhou Provincial Science and Technology Program (黔科合支撑［2020］2Y013); Guizhou Provincial Graduate Education Innovation Program (黔教合YJSCXJH［2019］043).

Abstract

Abstract: Traditional shallow machine learning methods for identifying malicious TLS traffic rely heavily on expert experience and represent traffic poorly, while existing deep neural network detection models take too long to train because of their complex layered structure. To address these problems, a lightweight end-to-end method for identifying malicious encrypted traffic is proposed based on CNN-SIndRNN. The method uses a multi-layer one-dimensional convolutional neural network to extract local pattern features from the traffic byte sequence, and applies global max pooling to reduce dimensionality and the number of parameters to compute. To enhance traffic representation, an improved recurrent neural network is designed in parallel to capture long-distance dependencies among traffic bytes: the Independently Recurrent Neural Network (IndRNN) unit replaces the traditional Recurrent Neural Network (RNN) unit, and a sliced parallel computing structure replaces the serial computing structure of the traditional RNN. The features extracted by the two types of deep neural networks are then concatenated to represent malicious TLS traffic. The effectiveness of the proposed method is verified on two open datasets. Experimental results on the public CTU-Malware-Capture dataset show that the method achieves an F1 score of 0.9657 in the binary classification experiment and an overall accuracy of 84.89% in the multi-classification experiment. Compared with the BotCatcher model, the CNN-SIndRNN model improves classification performance while reducing training time by 98.47% and detection time by 98.28%.

Keywords: malicious TLS traffic, independently recurrent neural network, sliced recurrent neural network, one-dimensional convolutional neural network, global pooling
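The two feature branches described in the abstract can be illustrated with a small, untrained sketch. It assumes scalar inputs (bytes scaled to [0, 1]), a single convolution layer instead of several, and hand-picked toy weights; the function names, layer sizes, slice length, and sample bytes are all hypothetical and are not taken from the article. The recurrence uses the standard IndRNN form, h_t = ReLU(W x_t + u ⊙ h_{t-1} + b), in which each neuron keeps only its own scalar recurrent weight, and the slicing follows the general sliced-RNN idea of running short, independent sub-sequences and passing their final states to the next level.

```python
def conv1d_global_max(seq, kernels, biases):
    """CNN branch: 1D convolution + ReLU, then global max pooling.

    Returns one feature per filter (the largest activation over all positions),
    so the output size does not depend on the sequence length.
    """
    feats = []
    for kernel, bias in zip(kernels, biases):
        width = len(kernel)
        activations = [
            max(0.0, sum(kernel[j] * seq[i + j] for j in range(width)) + bias)
            for i in range(len(seq) - width + 1)
        ]
        feats.append(max(activations))
    return feats


def indrnn_final_state(inputs, W, u, b):
    """IndRNN recurrence: h_t = relu(W x_t + u * h_{t-1} + b).

    u is a vector, so neuron i only sees its own previous state h[i].
    Returns the hidden state after the last input vector.
    """
    h = [0.0] * len(u)
    for x in inputs:
        h = [
            max(0.0, sum(W[i][j] * x[j] for j in range(len(x))) + u[i] * h[i] + b[i])
            for i in range(len(u))
        ]
    return h


def sliced_indrnn(inputs, slice_len, layers):
    """Sliced structure: split the sequence into short slices at each level.

    Slices within a level do not depend on each other, so they could be
    computed in parallel; only their final states are passed to the next level.
    """
    seq = inputs
    for W, u, b in layers:
        slices = [seq[i:i + slice_len] for i in range(0, len(seq), slice_len)]
        seq = [indrnn_final_state(s, W, u, b) for s in slices]
    if len(seq) != 1:
        raise ValueError("layers and slice_len must reduce the sequence to one state")
    return seq[0]


# Toy input: 16 made-up bytes (not real traffic), scaled to [0, 1].
payload = bytes([22, 3, 1, 2, 0, 1, 0, 1, 252, 3, 3, 0, 0, 0, 0, 0])
seq = [value / 255 for value in payload]

# Hypothetical, untrained weights chosen only to make the sketch runnable.
cnn_kernels = [[0.5, -0.25, 0.5], [-1.0, 1.0, 0.0]]
cnn_biases = [0.0, 0.1]
rnn_layers = [
    ([[1.0], [-0.5]], [0.9, 0.5], [0.0, 0.2]),            # level 0: input dim 1 -> hidden 2
    ([[0.5, 0.5], [1.0, -1.0]], [0.8, 0.3], [0.0, 0.0]),  # level 1: hidden 2 -> hidden 2
]

# Concatenate both branches into one traffic representation.
local_feats = conv1d_global_max(seq, cnn_kernels, cnn_biases)
long_range_feats = sliced_indrnn([[v] for v in seq], 4, rnn_layers)  # 16 -> 4 -> 1 states
features = local_feats + long_range_feats
print(features)
```

With a slice length of 4, the 16-step sequence needs only 4 sequential steps per level (8 in total over two levels) rather than 16 in a single serial pass, which is the kind of saving the sliced structure aims for. A classifier layer on top of `features` is omitted here.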

CLC number: TP309

LI Xiaojian, XIE Xiaoyao, XU Yang, ZHANG Sicong. Fast Identification Method of Malicious TLS Traffic Based on CNN-SIndRNN[J]. Computer Engineering, 2022, 48(4): 148-157, 164.

https://www.ecice06.com/CN/Y2022/V48/I4/148

Figures/Tables: 15

