
Computer Engineering, 2023, Vol. 49, Issue (5): 165-172, 180. doi: 10.19678/j.issn.1000-3428.0064805

• Cyberspace Security •

Encrypted Malicious Traffic Detection Based on Stacking and Multi-Feature Fusion

HUO Yuehua1,2, ZHAO Faqi1

  1. School of Mechanical Electronic & Information Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China;
  2. Network and Information Center, China University of Mining and Technology-Beijing, Beijing 100083, China
  • Received: 2022-05-25 Revised: 2022-07-22 Published: 2022-08-08
  • About the authors: HUO Yuehua (born 1981), male, senior engineer; main research interests are network security, communication, and monitoring. ZHAO Faqi, master's student.
  • Funding:
    National Key Research and Development Program of China (2016YFC0801800).

Abstract: While encryption protects network communications, a large amount of malware also uses encrypted protocols to hide its malicious behavior. Existing machine-learning-based models for detecting Transport Layer Security (TLS) encrypted malicious traffic have two problems: single-model detection algorithms adapt poorly to multi-granularity features, and the false alarm rate is high on mixed traffic. A method for detecting TLS-encrypted malicious traffic without decryption, based on a Stacking strategy and multi-feature fusion, is proposed. The multi-granularity nature of encrypted malicious traffic features is analyzed, and flow features, connection features, and TLS handshake features are extracted. The extracted features are reduced through feature engineering to lower computational overhead. Random Forest (RF), XGBoost, and Gaussian Naive Bayes (GNB) classifier models are then built on the three reduced feature categories to learn the patterns hidden in the traffic. On this basis, the processed multidimensional features are fused using flow fingerprints, and the three classifiers are combined with the Stacking strategy to form the DMMFC detection model, which identifies TLS-encrypted malicious traffic in the network. The model is evaluated on the public CTU-13 dataset. Experimental results show that, in binary classification, the method reaches a recall of 99.93% and a False Alarm Rate (FAR) below 0.10% for malicious traffic detection, so it can effectively detect TLS-encrypted malicious traffic without decryption.

Key words: encrypted malicious traffic, Transport Layer Security (TLS) protocol, Stacking strategy, feature dimensionality reduction, multi-feature fusion
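
The stacking idea summarized in the abstract (base classifiers trained on separate feature categories, with their outputs combined by a second-level learner) can be illustrated with a minimal sketch. This is not the DMMFC model: it assumes one base learner per feature category, uses Gaussian naive Bayes for all three base learners in place of Random Forest and XGBoost so that it runs with the Python standard library alone, and assumes a logistic-regression meta-learner, which the abstract does not specify. The column layout in `GROUPS`, the synthetic data, and all function names are hypothetical; feature reduction and flow-fingerprint fusion are omitted.

```python
import math
import random

# Hypothetical column layout: which columns belong to which feature category.
GROUPS = {"flow": [0, 1], "connection": [2, 3], "tls_handshake": [4, 5]}

def fit_gnb(X, y):
    """Fit Gaussian naive Bayes: per class, store (prior, means, variances)."""
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-6
                     for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(X), means, variances)
    return model

def gnb_proba(model, x):
    """Posterior probability of class 1 (malicious) for one sample."""
    log_post = {}
    for c, (prior, means, variances) in model.items():
        log_post[c] = math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            for v, m, s2 in zip(x, means, variances))
    top = max(log_post.values())  # subtract the max for numerical stability
    e0, e1 = math.exp(log_post[0] - top), math.exp(log_post[1] - top)
    return e1 / (e0 + e1)

def select(X, cols):
    """Keep only the columns of one feature category."""
    return [[row[i] for i in cols] for row in X]

def out_of_fold(X, y, k=5):
    """Out-of-fold class-1 probabilities, so the meta-learner never sees a
    base-learner prediction made on that learner's own training rows."""
    oof = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(X), k):
            oof[i] = gnb_proba(model, X[i])
    return oof

def logreg_proba(w, z):
    """Sigmoid of the weighted sum; the bias is the last element of w."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + w[-1]
    return 1.0 / (1.0 + math.exp(-score))

def fit_logreg(Z, y, lr=0.5, epochs=300):
    """Meta-learner: logistic regression trained by batch gradient descent."""
    w = [0.0] * (len(Z[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for z, label in zip(Z, y):
            err = logreg_proba(w, z) - label
            for j, zj in enumerate(z):
                grad[j] += err * zj
            grad[-1] += err
        w = [wi - lr * g / len(Z) for wi, g in zip(w, grad)]
    return w

def make_data(n, rng):
    """Synthetic stand-in for labelled traffic records (not CTU-13)."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0) for _ in range(6)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(200, rng)

# Level 0: one base learner per feature category, out-of-fold on the training set.
meta_train = list(zip(*(out_of_fold(select(X_train, cols), y_train)
                        for cols in GROUPS.values())))
# Level 1: the meta-learner combines the three base-learner probabilities.
weights = fit_logreg(meta_train, y_train)

# Refit each base learner on all training rows, then stack on the test set.
base = {name: fit_gnb(select(X_train, cols), y_train)
        for name, cols in GROUPS.items()}
meta_test = [[gnb_proba(base[name], [row[i] for i in cols])
              for name, cols in GROUPS.items()] for row in X_test]
predictions = [int(logreg_proba(weights, z) >= 0.5) for z in meta_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"stacked accuracy on synthetic data: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions is the usual way to keep the second level from overfitting to base-learner outputs produced on their own training rows. The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the abstract.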

CLC number: