
Computer Engineering ›› 2025, Vol. 51 ›› Issue (9): 192-200. doi: 10.19678/j.issn.1000-3428.0070021

• Cyberspace Security •

基于有监督自编码器的TLS加密异常流量检测

杨明芬1, 甘昀2, 张兴鹏2,*

  1. 西藏自治区科技信息研究所,西藏 拉萨 851400
  2. 西南石油大学计算机与软件学院,四川 成都 610500
  • 收稿日期:2024-06-20 修回日期:2024-08-22 出版日期:2025-09-15 发布日期:2025-09-26
  • 通讯作者: 张兴鹏
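  • Funding:
    Southwest Petroleum University Natural Science "Qihang Plan" (启航计划) Project (2022QHZ023); Southwest Petroleum University Natural Science "Qihang Plan" (启航计划) Project (2022QHZ013); Sichuan Science and Technology Innovation Talent Fund (2022JDRC0009); Natural Science Foundation of Sichuan Province (2022NSFSC0283); Key Research and Development Project of the Sichuan Provincial Department of Science and Technology (2023YFG0129)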

Transport Layer Security-Encrypted Abnormal Traffic Detection Based on Supervised Autoencoder

YANG Mingfen1, GAN Yun2, ZHANG Xingpeng2,*

  1. Tibet Autonomous Region Institute of Science and Technology Information, Lhasa 851400, Xizang, China
  2. School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, Sichuan, China
  • Received:2024-06-20 Revised:2024-08-22 Online:2025-09-15 Published:2025-09-26
  • Contact: ZHANG Xingpeng

摘要:

随着用户对隐私保护意识的增强,越来越多的网站和服务使用传输层安全(TLS)协议来保护用户数据,这导致TLS加密流量在网络传输流量中的占比越来越高。但目前大多数异常流量检测方法是针对所有流量或所有加密流量的通用检测模型,而专门研究TLS加密流量的方法较少。因此,提出一种基于有监督自编码器的TLS加密异常流量检测方法。该方法的核心是训练一个有监督自编码器,其将网络流量作为输入,生成与输入流量维度相同的重构流量,并要求正常流量与对应的重构流量之间相似度极高,异常流量与重构流量之间相似度极低。为达到上述重构要求,设计一个重构损失函数来有监督地优化自编码器内部参数。在检测阶段,利用自编码器的重构能力,通过衡量输入流量与重构流量之间的余弦相似度来判断输入流量是否为异常流量。此外,通过整合数据构建一个专门用于TLS加密异常流量检测任务的数据集,在此数据集上的实验结果表明,该方法在TLS加密异常流量检测二分类任务上的准确率达到99.52%,优于其他对比模型,同时多种可视化策略展现了所提方法的有效性。

关键词: TLS加密, 自编码器, 异常流量检测, 重构损失, 可视化分析

Abstract:

As users become more aware of privacy protection, a growing number of websites and services employ the Transport Layer Security (TLS) protocol to safeguard user data, and the share of TLS-encrypted traffic in overall network traffic is rising steadily. However, most current abnormal traffic detection methods are general-purpose models that target all traffic or all encrypted traffic; few focus specifically on TLS-encrypted traffic. This study therefore proposes a supervised autoencoder-based method for detecting abnormal TLS-encrypted traffic. At its core, the method trains a supervised autoencoder that takes network traffic as input and generates reconstructed traffic with the same dimensionality as the input. The model is required to produce extremely high similarity between normal traffic and its reconstruction, and extremely low similarity between abnormal traffic and its reconstruction. To meet these requirements, a reconstruction loss function is designed to optimize the internal parameters of the autoencoder in a supervised manner. During the detection phase, the cosine similarity between the input traffic and its reconstruction is measured to determine whether the input is abnormal. Furthermore, a dataset dedicated to TLS-encrypted abnormal traffic detection is constructed by integrating relevant data. Experimental results on this dataset show that the proposed method achieves an accuracy of 99.52% on the binary classification task of TLS-encrypted abnormal traffic detection, outperforming the comparison models. In addition, several visualization strategies illustrate the effectiveness of the proposed method.

Key words: TLS encryption, autoencoder, abnormal traffic detection, reconstruction loss, visual analysis
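
The detection rule described in the abstract (compare each input with its reconstruction by cosine similarity) can be illustrated with a minimal NumPy sketch. The abstract does not give the loss formula, the decision threshold, or the network architecture, so everything below is an assumption made for illustration: the function names, the loss form in `supervised_reconstruction_loss`, and the `threshold=0.5` default are hypothetical and are not taken from the paper. The reconstructions `x_hat` are assumed to come from some already-trained autoencoder.

```python
import numpy as np


def cosine_similarity(x, x_hat, eps=1e-12):
    """Row-wise cosine similarity between inputs and their reconstructions."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    num = np.sum(x * x_hat, axis=1)
    den = np.linalg.norm(x, axis=1) * np.linalg.norm(x_hat, axis=1)
    # eps guards against division by zero for all-zero rows.
    return num / np.maximum(den, eps)


def supervised_reconstruction_loss(x, x_hat, is_abnormal):
    """Hypothetical label-aware loss (an assumption, not the paper's formula).

    Normal samples are penalized by (1 - similarity), which rewards
    reconstructions close to the input. Abnormal samples are penalized by
    max(similarity, 0), which rewards reconstructions unlike the input.
    """
    sim = cosine_similarity(x, x_hat)
    is_abnormal = np.asarray(is_abnormal, dtype=bool)
    per_sample = np.where(is_abnormal, np.maximum(sim, 0.0), 1.0 - sim)
    return float(per_sample.mean())


def detect_abnormal(x, x_hat, threshold=0.5):
    """Flag a sample as abnormal when its similarity falls below the threshold.

    The threshold value is an illustrative assumption.
    """
    return cosine_similarity(x, x_hat) < threshold


if __name__ == "__main__":
    # Toy feature vectors standing in for traffic features.
    x = np.array([[1.0, 0.0], [0.0, 1.0]])
    # Toy reconstructions: the first is parallel to its input,
    # the second is orthogonal to its input.
    x_hat = np.array([[2.0, 0.0], [1.0, 0.0]])
    print(detect_abnormal(x, x_hat))
    print(supervised_reconstruction_loss(x, x_hat, [False, True]))
```

In a real pipeline the threshold would presumably be chosen on validation data, and the loss would be written in an autodiff framework so that it can drive training; the NumPy version only shows the quantities involved.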