Computer Engineering


Research on an Image Attack Method Based on a Dual Adversarial Mechanism

  • Online: 2020-12-09  Published: 2020-12-09

Abstract: An image attack adds a small perturbation to an image so that a deep neural network fails on it. Most existing image attack algorithms are fragile: a small amount of preprocessing applied to the adversarial samples can strip them of their attack capability. To address the unstable performance of existing image attack methods under VAE defense, this paper builds on the AdvGAN algorithm and proposes AntiVAEGAN, a model based on an adversarial mechanism that attacks a VAE defense stably. To further address the instability of AntiVAEGAN when the defense model's capability improves, a new image attack model, VAEAdvGAN, is obtained through a dual adversarial mechanism between the generator and the discriminator and between the generator and the VAE. Experimental results on the MNIST and GTSRB datasets show that, without defense, AntiVAEGAN and VAEAdvGAN achieve almost the same classification accuracy and attack success rate as AdvGAN; under VAE defense, VAEAdvGAN outperforms AdvGAN and, in most cases, also outperforms AntiVAEGAN.
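The core idea behind AntiVAEGAN, as described above, is that the generator's perturbation must still fool the target classifier after the adversarial image has passed through the VAE defense, which reconstructs ("purifies") its input. The sketch below shows one plausible way to express such a generator objective in PyTorch; the network shapes, loss weights, and all names (`Generator`, `TinyVAE`, `anti_vae_generator_loss`) are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Produces a small, bounded perturbation G(x) for a 1x28x28 image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                                 nn.Linear(256, 784), nn.Tanh())
    def forward(self, x):
        return 0.3 * self.net(x).view_as(x)   # tanh output scaled to a small amplitude

class TinyVAE(nn.Module):
    """Stand-in for the pre-trained VAE 'purification' defense (kept fixed here)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU())
        self.mu, self.logvar = nn.Linear(64, 16), nn.Linear(64, 16)
        self.dec = nn.Sequential(nn.Linear(16, 784), nn.Sigmoid())
    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z).view_as(x)

def anti_vae_generator_loss(G, D, vae, clf, x, y,
                            w_gan=1.0, w_adv=1.0, w_vae=1.0, w_pert=0.1):
    """Generator objective: look real to D, fool the classifier on the raw
    adversarial image AND on its VAE reconstruction, keep the perturbation small."""
    x_adv = torch.clamp(x + G(x), 0.0, 1.0)
    x_rec = vae(x_adv)                                   # image after VAE purification
    d_out = D(x_adv)
    loss_gan  = F.binary_cross_entropy(d_out, torch.ones_like(d_out))
    loss_adv  = -F.cross_entropy(clf(x_adv), y)          # untargeted: push away from the true label
    loss_vae  = -F.cross_entropy(clf(x_rec), y)          # the attack must survive purification
    loss_pert = (x_adv - x).flatten(1).norm(dim=1).mean()
    return w_gan * loss_gan + w_adv * loss_adv + w_vae * loss_vae + w_pert * loss_pert

# Toy usage with random data and untrained stand-in networks.
D   = nn.Sequential(nn.Flatten(), nn.Linear(784, 1), nn.Sigmoid())  # discriminator
clf = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))               # target classifier
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
loss = anti_vae_generator_loss(Generator(), D, TinyVAE(), clf, x, y)
loss.backward()
```

The extra `loss_vae` term is what distinguishes this objective from a plain AdvGAN-style generator loss: it penalizes adversarial examples whose effect is washed out by VAE reconstruction.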

Abstract: Image attack refers to adding a small perturbation to an input image so that a deep neural network misclassifies it. Recent research has shown that most image attack algorithms are relatively fragile: a small change to the adversarial image, such as simple preprocessing, can make it lose its attack capability. To address this problem, this paper proposes an image attack model, AntiVAEGAN, which resists VAE defense and makes the attack more robust. On the basis of AntiVAEGAN, the paper further proposes a VAEAdvGAN model that dynamically updates both the attack model and the VAE defense model. The attack model is trained through adversarial training between the attack-sample generation network and the VAE defense network. Experimental results on the MNIST and GTSRB datasets show that AntiVAEGAN and VAEAdvGAN achieve almost the same classification accuracy and attack success rate as AdvGAN when no defense is applied. Under VAE defense, VAEAdvGAN performs better than AdvGAN and, in most cases, better than AntiVAEGAN.
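Both abstracts describe VAEAdvGAN as a dual adversarial scheme: the generator plays against the discriminator and against the VAE defense, and the defense model itself is updated as training proceeds. The following is a minimal sketch of such an alternating update schedule, using assumed PyTorch stand-in networks and synthetic data in place of MNIST/GTSRB; the three-step order and all hyperparameters are illustrative rather than the authors' exact training procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in networks for 1x28x28 images (shapes and sizes are illustrative only).
G   = nn.Sequential(nn.Flatten(), nn.Linear(784, 784), nn.Tanh())    # perturbation generator
D   = nn.Sequential(nn.Flatten(), nn.Linear(784, 1), nn.Sigmoid())   # discriminator
vae = nn.Sequential(nn.Flatten(), nn.Linear(784, 32), nn.ReLU(),     # autoencoder stand-in
                    nn.Linear(32, 784), nn.Sigmoid())                #   for the VAE defense
clf = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))                # target classifier (not trained here)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
opt_v = torch.optim.Adam(vae.parameters(), lr=1e-3)

for step in range(100):
    x = torch.rand(64, 1, 28, 28)                 # synthetic images in place of MNIST/GTSRB
    y = torch.randint(0, 10, (64,))
    x_adv = torch.clamp(x + 0.3 * G(x).view_as(x), 0.0, 1.0)

    # 1) Discriminator: distinguish clean images from adversarial ones.
    d_real, d_fake = D(x), D(x_adv.detach())
    loss_d = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Defense: learn to reconstruct (purify) the current adversarial images.
    loss_v = F.mse_loss(vae(x_adv.detach()).view_as(x), x)
    opt_v.zero_grad(); loss_v.backward(); opt_v.step()

    # 3) Generator: fool D, fool the classifier on x_adv and on its purified
    #    version, and keep the perturbation small.
    d_fake = D(x_adv)
    x_rec = vae(x_adv).view_as(x)
    loss_g = (F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
              - F.cross_entropy(clf(x_adv), y)
              - F.cross_entropy(clf(x_rec), y)
              + 0.1 * (x_adv - x).flatten(1).norm(dim=1).mean())
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Updating the defense in step 2 before the generator in step 3 is what the abstract calls the dynamic update of both models: the generator always attacks the current, improving defense rather than a fixed pre-trained VAE, which is the stated difference between VAEAdvGAN and AntiVAEGAN.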