
计算机工程 (Computer Engineering) ›› 2024, Vol. 50 ›› Issue (2): 180-187. doi: 10.19678/j.issn.1000-3428.0067077

• Cyberspace Security •

Adversarial Example Generation Algorithm Based on Transformer and GAN

Shuaiwei LIU (刘帅威)*, Zhi LI (李智), Guomei WANG (王国美), Li ZHANG (张丽)

  1. State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-03-03  Online: 2024-02-15  Published: 2024-02-19
  • Corresponding author: Shuaiwei LIU
  • Supported by:
    National Natural Science Foundation of China (62062023)

Adversarial Example Generation Algorithm Based on Transformer and GAN

Shuaiwei LIU*, Zhi LI, Guomei WANG, Li ZHANG

  1. State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-03-03  Online: 2024-02-15  Published: 2024-02-19
  • Contact: Shuaiwei LIU

Abstract:

Adversarial attack and defense is an active research direction in computer security. To address the poor visual quality of existing gradient-based adversarial example generation methods and the low generation efficiency of optimization-based methods, an adversarial example generation algorithm based on the Transformer and the Generative Adversarial Network (GAN), named Trans-GAN, is proposed. First, exploiting the strong visual representation capability of the Transformer, it is used as a reconstruction network that receives a clean image and generates attack noise. Second, the Transformer reconstruction network serves as the generator and is combined with a discriminator based on a deep convolutional network to form a GAN architecture, which improves the realism of the generated images and ensures training stability; meanwhile, an improved attention mechanism, Targeted Self-Attention, is proposed, which introduces the target label as prior knowledge during training and guides the model to learn to generate adversarial perturbations with a specific attack target. Finally, a skip connection is used to apply the adversarial noise to the clean example, forming an adversarial example that attacks the target classification network. Experimental results show that Trans-GAN achieves attack success rates above 99.9% against both models on the MNIST dataset and 96.36% and 98.47% against the two models on the CIFAR10 dataset, outperforming current state-of-the-art generative adversarial example generation methods. Compared with the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), the adversarial noise generated by Trans-GAN has a smaller perturbation magnitude, and the resulting adversarial examples are more natural and hard for human vision to distinguish.
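The abstract names a Targeted Self-Attention mechanism that injects the target label as prior knowledge, but this page does not give its exact formulation. The PyTorch-style sketch below shows one plausible way such conditioning could work: the target class is embedded as an extra token that attends jointly with the patch tokens. All names (TargetedSelfAttention, label_embed, num_classes) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch, assuming the target label is injected as an extra token
# into an otherwise standard self-attention block (an assumption, not the paper's code).
import torch
import torch.nn as nn

class TargetedSelfAttention(nn.Module):
    """Self-attention over image patch tokens, conditioned on an attack-target label."""
    def __init__(self, dim, num_classes, num_heads=4):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, dim)   # target label as prior knowledge
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens, target_label):
        # tokens: (B, N, dim) patch embeddings; target_label: (B,) class indices
        label_token = self.label_embed(target_label).unsqueeze(1)   # (B, 1, dim)
        x = torch.cat([label_token, tokens], dim=1)                 # prepend the label token
        attn_out, _ = self.attn(x, x, x)                            # standard QKV self-attention
        x = self.norm(x + attn_out)                                 # residual connection + norm
        return x[:, 1:, :]                                          # drop label token, keep patches

if __name__ == "__main__":
    block = TargetedSelfAttention(dim=64, num_classes=10)
    out = block(torch.randn(2, 49, 64), torch.tensor([3, 7]))
    print(out.shape)  # torch.Size([2, 49, 64])
```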

Key words: deep neural network, adversarial example, adversarial attack, Transformer model, Generative Adversarial Network (GAN), attention mechanism

Abstract:

Adversarial attack and defense is a popular research area in computer security. Trans-GAN, an adversarial example generation algorithm based on the combination of the Transformer and the Generative Adversarial Network (GAN), is proposed to address the poor visual quality of existing gradient-based adversarial example generation methods and the low generation efficiency of optimization-based methods. First, the algorithm exploits the powerful visual representation capability of the Transformer, using it as a reconstruction network that receives clean images and generates adversarial noise. Second, the Transformer reconstruction network serves as the generator and is combined with a discriminator based on a deep convolutional network to form a GAN architecture, which improves the authenticity of the generated images and ensures the stability of training. Meanwhile, an improved attention mechanism, Targeted Self-Attention, is proposed that introduces target labels as prior knowledge during training and guides the network model to learn to generate adversarial perturbations with specific attack targets. Finally, the adversarial noise is added to the clean examples through skip connections to form adversarial examples that attack the target classification network. Experimental results demonstrate that the proposed algorithm achieves attack success rates of more than 99.9% on both models used for the MNIST dataset and of 96.36% and 98.47% on the two models used for the CIFAR10 dataset, outperforming current state-of-the-art generative adversarial attack methods. Qualitative results show that, compared with the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) algorithms, Trans-GAN generates adversarial noise with a smaller perturbation magnitude, and the resulting adversarial examples are more natural and not easily distinguished by human vision.
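To make the pipeline in the abstract concrete, the sketch below outlines one possible generator-update step: the Transformer reconstruction network produces attack noise, a skip connection adds that noise to the clean image, and a GAN realism term is combined with a targeted-attack classification term. The function and argument names (trans_gan_step, generator, discriminator, target_model, eps) and the tanh bounding of the noise are assumptions for illustration, not the authors' released code.

```python
# Hedged sketch of one Trans-GAN-style generator step, under assumed component names.
import torch
import torch.nn.functional as F

def trans_gan_step(generator, discriminator, target_model, x_clean, target_label, eps=0.1):
    # 1) Transformer reconstruction network maps the clean image (and target label) to attack noise.
    noise = generator(x_clean, target_label)
    noise = eps * torch.tanh(noise)              # bound the perturbation (assumed design choice)

    # 2) Skip connection: adversarial example = clean image + generated noise.
    x_adv = torch.clamp(x_clean + noise, 0.0, 1.0)

    # 3) GAN realism term: the generator wants the discriminator to score x_adv as real.
    d_fake = discriminator(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))

    # 4) Targeted-attack term: push the victim classifier toward the chosen target class.
    logits = target_model(x_adv)
    loss_attack = F.cross_entropy(logits, target_label)

    # d_real would be used in a separate discriminator update (not shown here).
    d_real = discriminator(x_clean)
    return x_adv, loss_gan + loss_attack, (d_real, d_fake)
```

In this reading, the discriminator is trained in alternation with the generator (as in a standard GAN), while the attack term supplies the "specific attack target" guidance described above.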

Key words: deep neural network, adversarial example, adversarial attack, Transformer model, Generative Adversarial Network (GAN), attention mechanism