
Computer Engineering ›› 2026, Vol. 52 ›› Issue (3): 62-78. doi: 10.19678/j.issn.1000-3428.0070128

• Frontier Perspectives and Reviews •

Survey of Deep Learning Backdoor Attack on Image Data

WANG Renshuai1,2, YANG Kuiwu2, CHEN Yue2,*, WANG Wen2, WEI Jianghong2

  1. School of Cyber Security, Zhengzhou University, Zhengzhou 450002, Henan, China
    2. The PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, Henan, China
  • Received: 2024-07-16 Revised: 2024-09-18 Online: 2026-03-15 Published: 2026-03-10
  • Contact: CHEN Yue

  • About the authors:

    WANG Renshuai, male, M.S.; his research interests include deep learning and big data security.

    YANG Kuiwu, Associate Professor, Ph.D.

    CHEN Yue (corresponding author), Professor, Ph.D.

    WANG Wen, M.S.

    WEI Jianghong, Lecturer, Ph.D.

  • Funding:
    National Natural Science Foundation of China (62172433, 62172434)

Abstract:

An in-depth exploration of backdoor attacks is important for the security and robustness of deep learning models. With the widespread application of deep learning, the use of third-party data and pre-trained models has become common; however, this practice introduces potential security threats. Researchers have found that malicious code or hidden backdoors can be introduced into a model through unverified third-party resources and activated under specific conditions, causing abnormal model behavior. Backdoor attack methods in the image domain continue to develop rapidly; however, systematic reviews that comprehensively cover these techniques remain rare. To this end, this survey first introduces the concept of backdoor attacks and their basic attack process, and then analyzes how backdoor attacks differ from two related security threats: adversarial attacks and data poisoning attacks. Backdoor attack techniques in the image domain are then classified along seven dimensions: trigger type, fusion strategy, target class, model structure modification, model weight modification, code poisoning, and data ordering. The evolution of these techniques is traced, and their characteristics, performance, and respective strengths and weaknesses are analyzed. On this basis, the current research results are summarized, possible future research directions are examined from multiple perspectives, and the importance of building safe and reliable deep learning models is emphasized.

Key words: backdoor attack, AI security, pre-trained model, image data, neural network
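To make the basic attack process described in the abstract concrete, the sketch below illustrates the classic poisoning-based approach (in the style of patch-trigger attacks such as BadNets): a small, fixed trigger pattern is stamped onto a fraction of the training images, and those samples are relabeled to the attacker's target class. This is a minimal illustrative sketch, not a method proposed by this survey; the function name, shapes, and parameters are assumptions for demonstration.

```python
import numpy as np

def poison_dataset(images, labels, target_label, poison_rate=0.05,
                   patch_size=3, seed=0):
    """Dirty-label data poisoning sketch (illustrative, BadNets-style).

    images: float array of shape (N, H, W), pixel values in [0, 1]
    labels: int array of shape (N,)
    Returns poisoned copies of images/labels and the poisoned indices.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()

    # Choose a random subset of training samples to poison.
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Trigger: a patch of maximal intensity in the bottom-right corner.
    images[idx, -patch_size:, -patch_size:] = 1.0

    # Relabel poisoned samples to the attacker-chosen target class,
    # so the trained model associates the trigger with that class.
    labels[idx] = target_label
    return images, labels, idx
```

At inference time, the backdoored model behaves normally on clean inputs but predicts `target_label` whenever the same patch is stamped onto a test image; the survey's taxonomy covers how later work varies the trigger pattern, the fusion strategy, and the labeling scheme beyond this simple setup.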
