
Computer Engineering ›› 2023, Vol. 49 ›› Issue (10): 230-238. doi: 10.19678/j.issn.1000-3428.0065638

• Graphics and Image Processing •

Adversarial Examples Detection Method Based on Image Denoising and Compression

Feiyu WANG1, Fan ZHANG2, Jiayu DU3, Hongle LEI3, Xiaofeng QI2   

  1. Institute of Information Technology, Information Engineering University, Zhengzhou 450002, China
    2. National Digital Switching System Engineering and Technological R&D Center, Zhengzhou 450002, China
    3. Network Communication and Security Purple Mountain Laboratory, Nanjing 211111, China
  • Received: 2022-08-31    Online: 2023-10-15    Published: 2023-10-10

  • About the authors:

    Feiyu WANG (born 1997), male, master's student; his research interests include artificial intelligence and adversarial example defense.

    Fan ZHANG, associate research fellow, Ph.D.

    Jiayu DU, engineer, master's degree.

    Hongle LEI, engineer, master's degree.

    Xiaofeng QI, assistant research fellow, master's degree.

Abstract:

Numerous deep learning achievements in the field of computer vision have been widely applied in real life. However, adversarial examples can cause deep learning models to make incorrect predictions with high confidence, leading to serious security consequences, and existing adversarial example detection methods generally suffer from problems such as high computational cost or dependence on the statistical characteristics of examples. To address this, a new adversarial example detection method based on prediction inconsistency is proposed. Treating adversarial perturbations as unnecessary features, image denoising or compression techniques are used to compress the feature space of an example and thereby reduce the adversarial perturbation. For a normal example, the classification results of a deep learning model before and after feature-space compression usually differ only slightly, whereas for an adversarial example the classification results before and after the same processing differ significantly. Adversarial attacks are therefore detected by measuring the distance between the model's prediction on the original input and its prediction on the input after feature-space compression; if the distance exceeds a threshold, the input is judged to be adversarial. The selection of the training set for the proposed detection method is independent of adversarial examples, and no adjustment of the original deep learning model is required. The experimental results show that, while maintaining a low false-positive rate, the proposed method effectively detects classic attacks such as the Fast Gradient Sign Method (FGSM), the Jacobian-based Saliency Map Attack (JSMA), and the Carlini & Wagner (C&W) attack, achieving average detection rates of up to 99.77% and 87.90% on the MNIST and CIFAR-10 datasets, respectively.
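
As a concrete illustration of the detection rule described above, the following is a minimal Python sketch. It assumes a Keras-style classifier "model" whose predict method returns class probabilities for a batch of images with pixel values in [0, 1]; the 4-bit quantization and the 2x2 median filter stand in for the denoising/compression step, and the threshold would be calibrated on clean examples to keep the false-positive rate low. These specific choices are illustrative assumptions, not the authors' exact configuration.

import numpy as np
from scipy.ndimage import median_filter

def reduce_bit_depth(x, bits=4):
    # Compress the feature space by quantizing pixel values (x in [0, 1]).
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def median_smooth(x):
    # Simple denoising: 2x2 median filter over each image, per channel.
    # x is assumed to have shape (batch, height, width, channels).
    return median_filter(x, size=(1, 2, 2, 1))

def is_adversarial(model, x, threshold):
    # Prediction on the raw input versus prediction after feature-space compression.
    p_orig = model.predict(x)
    p_squeezed = model.predict(median_smooth(reduce_bit_depth(x)))
    # L1 distance between the two prediction vectors; a large gap flags an attack.
    distance = np.abs(p_orig - p_squeezed).sum(axis=-1)
    return distance > threshold

Because the detector only compares the model's outputs before and after compression, it needs no adversarial examples for training and leaves the original model unchanged, which is the property the abstract emphasizes.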

Key words: deep learning, adversarial examples, adversarial example detection, image denoising, image compression
