
Computer Engineering ›› 2022, Vol. 48 ›› Issue (4): 158-164. doi: 10.19678/j.issn.1000-3428.0061470

• Cyberspace Security •

Adversarial Sample Defense Method Based on Noise Dissolution

YANG Wenxue1,2, WU Fei1,2, GUO Tong1,2, XIAO Limin1,2

  1. State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China;
    2. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  • Received: 2021-04-26  Revised: 2021-07-22  Published: 2021-08-11
  • About the authors: YANG Wenxue (b. 1998), female, M.S. candidate, research interests include deep learning security; WU Fei, Ph.D. candidate; GUO Tong, M.S. candidate; XIAO Limin, professor, Ph.D.
  • Funding: National Key R&D Program of China (2017YFB1010000); Fund of the State Key Laboratory of Software Development Environment, Beihang University (SKLSDE-2020ZX-15).

Abstract: The security problems exposed during the rapid development of Deep Neural Networks (DNNs), such as adversarial attacks, have gradually attracted attention. Since the concept of adversarial examples was first proposed, a large number of adversarial attack algorithms against DNNs have emerged, and the complexity and weak interpretability of DNNs make these attacks difficult to defend against. To ensure the universality of the defense, this paper proposes a new defense method against adversarial examples that takes preprocessing as its basic idea and exploits the specificity of adversarial examples themselves. Considering the stealthiness and fragility of adversarial attacks, the method uses a noise-dissolution process, which leverages the robustness of deep learning models, to reduce both the aggressiveness of the adversarial perturbation and its tolerance to filtering. In the subsequent filtering process, the filtering range and intensity are adjusted adaptively according to the adversarial-noise contribution, so that adversarial noise is removed in a targeted manner. The method requires no modification or adjustment of existing deep learning models and is easy to deploy. Experimental results on the ImageNet dataset show that the defense success rate against the classical adversarial attacks L-BFGS, FGSM, Deepfool, JSMA, and C&W remains above 80%, which is at least 9.25, 14.86, and 14.32 percentage points higher than the classical preprocessing defenses JPEG compression, APE-GAN, and D3 (patch-wise image denoising), respectively. The method therefore achieves a good defense effect with strong universality.
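To make the pipeline the abstract describes concrete, the following is a minimal Python/NumPy sketch of a two-stage preprocessing defense: multiplicative-noise injection ("noise dissolution") followed by contribution-weighted adaptive smoothing. It is an illustration under stated assumptions, not the paper's implementation: all function names and parameters are hypothetical, and a random array stands in for the class-activation-based contribution map the paper estimates.

```python
# Hedged sketch of a noise-dissolution + adaptive-filtering preprocessing
# defense. Assumptions: images are in [0, 1]; the classifier is robust to
# mild speckle noise; a CAM-derived contribution map is available upstream.
import numpy as np
from scipy.ndimage import gaussian_filter


def dissolve_noise(x, strength=0.05, rng=None):
    """Inject mild multiplicative (speckle) noise to destabilize the
    carefully tuned adversarial perturbation, relying on the model's
    assumed robustness to keep the clean prediction intact."""
    rng = rng or np.random.default_rng()
    speckle = 1.0 + strength * rng.uniform(-1.0, 1.0, size=x.shape)
    return np.clip(x * speckle, 0.0, 1.0)


def adaptive_filter(x, contribution, max_sigma=2.0):
    """Blend a weak and a strong Gaussian filter per pixel, weighted by a
    [0, 1] map standing in for the adversarial-noise contribution
    (e.g., derived from class activation mapping)."""
    weak = gaussian_filter(x, sigma=0.5)
    strong = gaussian_filter(x, sigma=max_sigma)
    w = np.clip(contribution, 0.0, 1.0)
    return (1.0 - w) * weak + w * strong


def defend(x, contribution):
    """Two-stage preprocessing: dissolve the noise, then filter adaptively;
    the defended image is then fed to the unmodified classifier."""
    return adaptive_filter(dissolve_noise(x), contribution)


if __name__ == "__main__":
    img = np.random.rand(224, 224)  # stand-in for a grayscale ImageNet input
    cam = np.random.rand(224, 224)  # stand-in for a CAM-based contribution map
    out = defend(img, cam)
    print(out.shape, float(out.min()), float(out.max()))
```

Because the defense is pure input preprocessing, a sketch like this can sit in front of any existing classifier without retraining, which matches the deployability claim in the abstract.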

Key words: Deep Neural Network (DNN), adversarial examples, multiplicative noise, class activation mapping, adaptive filtering
