
Computer Engineering ›› 2023, Vol. 49 ›› Issue (1): 154-162. doi: 10.19678/j.issn.1000-3428.0064785

• Cyberspace Security •

Malicious Encrypted Traffic Detection Integrating One-Dimensional Inception Structure and ViT

SUN Yi1, GAO Jian1,2, GU Yijun1

  1. College of Information Network Security, People's Public Security University of China, Beijing 100038, China;
  2. Key Laboratory of Safety Precautions and Risk Assessment, Ministry of Public Security, Beijing 102623, China
  • Received: 2022-05-23 Revised: 2022-06-23 Published: 2022-08-31
  • About the authors: SUN Yi (born 1998), female, master's student, whose main research interests are network security and malicious traffic detection; GAO Jian (corresponding author), associate professor, Ph.D.; GU Yijun, professor, Ph.D.
  • Funding:
    2020 Special Fund for Basic Work on Strengthening Police through Science and Technology, Ministry of Public Security (2020GABJC01); Fundamental Research Funds of the People's Public Security University of China (2021JKF206).

Abstract: As Internet traffic becomes increasingly encrypted, traditional malicious traffic detection methods extract features that discriminate poorly on encrypted traffic. To better detect malicious traffic within encrypted traffic, this paper designs a detection model that integrates a one-dimensional Inception structure with the Vision Transformer (ViT). Because traffic data are sequential, the one-dimensional Inception structure adapts the Inception structure of GoogLeNet: it replaces two-dimensional convolution with one-dimensional convolution, which is better suited to sequence data, and adds a pooling operation to reduce interference from redundant information. The output of the one-dimensional Inception structure is then fed into the ViT model, where multi-head attention highlights important features and increases feature discrimination, improving the detection results. To verify the effectiveness of each module, the one-dimensional Inception-ViT model is compared with six variant models; the experimental results show that it performs best, with an average recall of 99.42% and an average F1 score of 99.39%. In a further comparison with eight existing models, the one-dimensional Inception-ViT model achieves better detection results. In the fine-grained classification of the malicious encrypted traffic classes Neris and Virut, it reduces confusion between samples relative to the best-performing baseline model and therefore identifies malicious encrypted traffic more accurately.

Key words: encrypted traffic, malicious encrypted traffic detection, multi-classification, Convolutional Neural Network (CNN), Vision Transformer (ViT) model
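The data flow described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the kernel sizes and weights in `KERNELS`, the pooling widths, the function names, and the use of a single attention head with identity query/key/value projections are all assumptions chosen to keep the example short. It only shows how parallel one-dimensional convolutions and a pooling branch produce channels that are then pooled, turned into tokens, and mixed by self-attention.

```python
import math

# Hypothetical branch kernels (sizes 1, 3, 5); the paper's settings are not given here.
KERNELS = [[1.0], [1 / 3] * 3, [0.2] * 5]

def conv1d_same(seq, kernel):
    """1D cross-correlation with zero padding; output length equals input length.
    Assumes an odd kernel length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(padded[i + j] * w for j, w in enumerate(kernel))
            for i in range(len(seq))]

def maxpool1d_same(seq, width=3):
    """Stride-1 max pooling that keeps the sequence length (Inception pooling branch)."""
    pad = width // 2
    padded = [float("-inf")] * pad + list(seq) + [float("-inf")] * pad
    return [max(padded[i:i + width]) for i in range(len(seq))]

def downsample(seq, width=2):
    """Non-overlapping max pooling that shortens the sequence."""
    return [max(seq[i:i + width]) for i in range(0, len(seq) - width + 1, width)]

def inception1d(seq, kernels):
    """Run the parallel branches and stack their outputs as channels."""
    channels = [conv1d_same(seq, k) for k in kernels]
    channels.append(maxpool1d_same(seq))
    return channels

def to_tokens(channels):
    """Turn (channels x positions) into one token (channel vector) per position."""
    return [list(t) for t in zip(*channels)]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections
    (a simplification of the multi-head attention used in ViT)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out

def extract_features(seq, kernels):
    """Inception branches -> pooling -> tokens -> self-attention."""
    channels = inception1d(seq, kernels)
    pooled = [downsample(c) for c in channels]
    return self_attention(to_tokens(pooled))

if __name__ == "__main__":
    # Made-up normalized byte values standing in for one traffic sample.
    sample = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4]
    for token in extract_features(sample, KERNELS):
        print([round(x, 3) for x in token])
```

In a trained model the convolution kernels and the attention projections would be learned, and several attention heads would run in parallel. The fixed averaging kernels here only stand in for those learned weights.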
