Computer Engineering, 2024, Vol. 50, Issue 11: 197-206. doi: 10.19678/j.issn.1000-3428.0068295

• Cyberspace Security •

Web Attacks Detection Method Based on BERT with Multi-Model Fusion

YUAN Pingyu, QIU Lin*

  1. School of Computer Science, Yangtze University, Jingzhou 434023, Hubei, China
  • Received: 2023-08-28 Online: 2024-11-15 Published: 2024-11-01
  • Contact: QIU Lin
  • Funding:
    Hubei Universities 2020 Provincial Teaching Research Project (2020418)

Abstract:

Traditional Web attack detection methods have low accuracy and cannot effectively defend against Web attacks. To address this problem, we propose a Web attack detection method based on the fusion of three models: the Bidirectional Encoder Representations from Transformers (BERT) pre-trained model, a Text Convolutional Neural Network (TextCNN), and a Bidirectional Long Short-Term Memory (BiLSTM) network. HTTP requests are first preprocessed. A BERT model is then trained on the preprocessed requests to produce context-dependent feature vectors. Next, the TextCNN model extracts higher-order semantic features from these vectors, and the extracted features serve as the input to the BiLSTM. Finally, a Softmax function performs the classification. The proposed method is validated on two datasets: HTTP CSIC 2010 and a malicious URL detection dataset. Compared with traditional machine learning methods, such as the Support Vector Machine (SVM) and Logistic Regression (LR), and with more recent methods, the proposed BERT-based multi-model fusion method performs better on accuracy, precision, recall, and F1 value, with the best accuracy and F1 value both above 99%, and can detect Web attacks accurately.
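
The abstract states that HTTP requests are preprocessed before being passed to BERT, but does not list the steps. The following is a minimal Python sketch of what such a step might look like, not the authors' procedure. The function name `preprocess_request`, the bounded URL-decoding loop, the lowercasing, and the punctuation splitting are all illustrative assumptions.

```python
import re
from urllib.parse import unquote_plus


def preprocess_request(method: str, url: str, body: str = "") -> str:
    """Flatten one HTTP request into a lowercase, space-separated string.

    Hypothetical helper: the concrete steps here are assumptions chosen
    for illustration, since the abstract only says "preprocessed".
    """
    # Join the non-empty parts of the request into one text sequence.
    raw = " ".join(part for part in (method, url, body) if part)

    # Undo percent-encoding; repeat a few times to catch double encoding
    # such as %2527, stopping as soon as the text no longer changes.
    decoded = raw
    for _ in range(3):
        once_more = unquote_plus(decoded)
        if once_more == decoded:
            break
        decoded = once_more

    decoded = decoded.lower()

    # Surround each punctuation character with spaces so that symbols
    # common in attack payloads (quotes, '=', '-') become separate tokens.
    spaced = re.sub(r"([^\w\s])", r" \1 ", decoded)

    # Collapse runs of whitespace into single spaces.
    return " ".join(spaced.split())


print(preprocess_request("GET", "/search?q=%27%20OR%201%3D1"))
# prints: get / search ? q = ' or 1 = 1
```

The resulting string would then be handed to a BERT tokenizer. Whether decoding or symbol splitting helps in practice depends on the tokenizer and dataset, so these choices should be treated as starting points.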

Key words: Web attack detection, Bidirectional Encoder Representations from Transformers (BERT), multi-model fusion, HTTP request, Text Convolutional Neural Network (TextCNN), Bidirectional Long Short-Term Memory (BiLSTM) network
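
The abstract reports four evaluation metrics: accuracy, precision, recall, and F1 value. As a reminder of how they relate, the sketch below computes them from binary confusion-matrix counts using the standard definitions, with attack requests as the positive class. The function name `classification_metrics` and the counts in the usage example are hypothetical and are not results from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from raw counts.

    tp: attacks flagged as attacks      fp: normal requests flagged as attacks
    tn: normal requests passed          fn: attacks that were missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 combines precision and recall, a detector can reach high accuracy on an imbalanced dataset while still scoring poorly on F1, which is one reason to report both.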