
Computer Engineering

   

An attention-based BERT-CNN-GRU detection method

  

  • Published: 2024-04-19


Abstract: To address the generally poor performance of existing methods on short domain names, a detection method combining BERT-CNN-GRU with an attention mechanism is proposed. First, BERT is employed to extract effective features of the domain name and the compositional logic between its characters. Deep domain-name features are then extracted by two parallel branches: a Convolutional Neural Network (CNN) fused with a simplified attention mechanism, and a Gated Recurrent Unit (GRU) network based on multi-head attention. The CNN, arranged in an n-gram fashion, extracts domain-name information at different granularities, and Batch Normalization (BN) is applied to optimize the convolution results. The GRU better captures compositional differences across the domain name, while the multi-head attention mechanism excels at capturing its internal compositional relationships. The outputs of the two parallel branches are concatenated, exploiting the strengths of both networks to the fullest, and a localized loss function is adopted to focus on the domain-name classification problem, ultimately improving classification performance. Experimental results show that the model achieves the best performance on binary classification. On the short-domain multi-class dataset, the Weighted F1-score over 15 classes reaches 86.21%, exceeding the BiLSTM-Seq-Attention model by 0.88 percentage points; on the UMUDGA dataset, the Weighted F1-score over 50 classes reaches 85.51%, an improvement of 0.45 percentage points. Moreover, the model performs strongly in detecting variant domain names and word-based DGAs, demonstrating robustness to imbalanced domain-name data distributions and a broader detection capability.
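The parallel CNN/GRU architecture described in the abstract can be sketched roughly as follows in PyTorch. This is a minimal illustration, not the authors' implementation: the BERT encoder is replaced by a plain character embedding so the example stays self-contained, and all layer widths, kernel sizes, head counts, and the 15-class output are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNGRUDetector(nn.Module):
    """Sketch of a BERT-CNN-GRU-style classifier: parallel CNN and GRU
    branches over character features, concatenated before classification.
    All hyperparameters here are assumptions, not the paper's values."""

    def __init__(self, vocab_size=40, embed_dim=64, num_classes=15):
        super().__init__()
        # Stand-in for BERT: a simple character embedding.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # CNN branch: n-gram-style kernels (2/3/4) with Batch Normalization,
        # each followed by max-pooling over the sequence.
        self.convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(embed_dim, 64, kernel_size=k, padding=k // 2),
                nn.BatchNorm1d(64),
                nn.ReLU(),
                nn.AdaptiveMaxPool1d(1),
            )
            for k in (2, 3, 4)
        ])
        # GRU branch followed by multi-head self-attention.
        self.gru = nn.GRU(embed_dim, 64, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(128, num_heads=4, batch_first=True)
        # Classifier over the concatenated branch outputs (3*64 + 128).
        self.fc = nn.Linear(3 * 64 + 128, num_classes)

    def forward(self, x):
        e = self.embed(x)                                   # (B, L, E)
        # CNN branch expects (B, E, L); pool each n-gram response to one value.
        cnn_out = torch.cat(
            [c(e.transpose(1, 2)).squeeze(-1) for c in self.convs], dim=1
        )
        h, _ = self.gru(e)                                  # (B, L, 128)
        a, _ = self.attn(h, h, h)                           # self-attention
        gru_out = a.mean(dim=1)                             # pool over sequence
        # Concatenate both branches, as the paper describes, then classify.
        return self.fc(torch.cat([cnn_out, gru_out], dim=1))

# Usage: a batch of 8 domain names, each encoded as 24 character ids.
model = CNNGRUDetector()
logits = model(torch.randint(0, 40, (8, 24)))
print(logits.shape)  # torch.Size([8, 15])
```

The concatenation of the pooled CNN features with the attention-weighted GRU features mirrors the paper's fusion step; in the actual method the embedding would come from a pretrained BERT and the loss would be the localized loss function mentioned in the abstract.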
