Adversarial Example Generation Algorithm Based on Stable Adam and Space Domain Transformation

doi:10.19678/j.issn.1000-3428.0066467

Abstract:

Deep neural networks are widely used in image classification, object detection, and natural language processing, but they are vulnerable to adversarial example attacks. Most existing attacks are based on the fast gradient sign method and achieve their effect by adding perturbations of the same magnitude to the input. These methods are effective, but they are poorly suited to quickly finding adversarial examples that generalize. To improve the generalization of adversarial examples, a gradient optimization algorithm that combines stable adaptive moment estimation with spatial-domain transformation is proposed to improve existing adversarial example generation algorithms. The Nesterov algorithm is introduced into the update of the first-order moment estimate. Following the AdaBelief algorithm, the Belief parameter is applied to the second-order moment estimate, and a decayed step size is computed from the exponential decay rates to obtain a more stable gradient. From the perspective of data augmentation, the input sample is transformed in the spatial domain during adversarial example generation, and the original gradient is updated with a weighted combination of the gradients of the different transformations, which improves the transferability of the adversarial examples. Experimental results show that the improved algorithm significantly improves the performance of the adversarial examples: the white-box attack success rate remains above 99.6%, and the black-box attack success rate increases to 74.5%.

Key words: adversarial example, gradient optimization, moment estimation, image transformation, transferability, black-box attack
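The abstract does not give the update equations, so the following is only a minimal sketch of how the three ingredients it names could fit together, not the authors' NABD-STW-NIM implementation. Everything concrete here is an assumption made for illustration: the toy linear classifier standing in for a deep network, circular shifts standing in for the spatial-domain transforms, the transform weights, the NI-FGSM-style look-ahead point, the standard AdaBelief form of the second-moment update, and the choice to apply the adaptive step directly instead of its sign. The function names `loss`, `loss_grad`, `weighted_transform_grad`, and `attack` are hypothetical.

```python
import numpy as np

def loss(x, w, y):
    """Logistic loss of a toy linear classifier (stand-in for a deep network)."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def loss_grad(x, w, y):
    """Analytic gradient of the logistic loss with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return (p - y) * w

def weighted_transform_grad(x, w, y, shifts=(-1, 0, 1), weights=(0.25, 0.5, 0.25)):
    """Weighted sum of input gradients taken through circular shifts of x.

    The shifts and weights are illustrative assumptions, not values from the paper.
    """
    g = np.zeros_like(x)
    for s, a in zip(shifts, weights):
        g_s = loss_grad(np.roll(x, s), w, y)  # gradient at the shifted input
        g += a * np.roll(g_s, -s)             # chain rule: undo the shift
    return g

def attack(x0, w, y, eps=0.3, steps=10, beta1=0.9, beta2=0.999, delta=1e-8):
    """Iterative attack with Nesterov look-ahead and AdaBelief-style moments."""
    alpha = eps / steps                        # per-iteration base step size
    x = x0.copy()
    m = np.zeros_like(x0)                      # first-moment estimate
    s = np.zeros_like(x0)                      # "belief" second-moment estimate
    for t in range(1, steps + 1):
        # Assumed Nesterov step: evaluate the gradient at a look-ahead point.
        x_look = x + alpha * beta1 * m
        g = weighted_transform_grad(x_look, w, y)
        g = g / (np.sum(np.abs(g)) + 1e-12)    # L1-normalise the gradient
        m = beta1 * m + (1.0 - beta1) * g
        # AdaBelief: track the squared deviation of g from its running mean.
        s = beta2 * s + (1.0 - beta2) * (g - m) ** 2 + delta
        # Bias correction from the exponential decay rates (decayed step size).
        m_hat = m / (1.0 - beta1 ** t)
        s_hat = s / (1.0 - beta2 ** t)
        # Assumption: ascend the loss with the adaptive step itself, not its sign.
        x = x + alpha * m_hat / (np.sqrt(s_hat) + delta)
        x = np.clip(x, x0 - eps, x0 + eps)     # stay inside the L-infinity ball
    return x

w = np.linspace(0.5, 1.5, 8)    # toy classifier weights
x0 = np.linspace(0.2, 1.0, 8)   # toy clean input with true label y = 1
x_adv = attack(x0, w, 1.0)
print(loss(x0, w, 1.0), loss(x_adv, w, 1.0))
```

In this toy setting the loss on `x_adv` is higher than on `x0` while the perturbation stays within `eps`. A real attack would replace `loss_grad` with backpropagation through the target network and the circular shifts with the paper's spatial-domain transforms.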

Yuting ZHANG, Haiyun XIANG, Qian LI, Haode LIAO. Adversarial Example Generation Algorithm Based on Stable Adam and Space Domain Transformation[J]. Computer Engineering, 2024, 50(1): 251-258.

http://www.ecice06.com/CN/Y2024/V50/I1/251

Figures/Tables: 8

Fig.1 Procedure of the NABD-STW-NIM algorithm

Fig.2 Adversarial examples generated by different attack methods

Fig.3 Relationship between each parameter and the attack success rate

Fig.4 Adversarial examples generated with different perturbation sizes

