
Computer Engineering ›› 2023, Vol. 49 ›› Issue (5): 139-149. doi: 10.19678/j.issn.1000-3428.0065260

• Cyberspace Security •

Adversarial Example Generation Method Based on Improved Genetic Algorithm

BAI Zhixu, WANG Hengjun

  1. Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  • Received: 2022-07-15 Revised: 2022-08-22 Published: 2022-09-29
  • About the authors: BAI Zhixu (born 1992), male, master's student, main research interest: artificial intelligence security; WANG Hengjun, associate professor, Ph.D.
  • Funding:
    National Key Research and Development Program of China (2017YFB0801904).

Abstract: Adversarial examples are an effective tool for evaluating the security and robustness of a model, and adversarial training on them can effectively improve model security. Existing adversarial attacks are commonly divided into white-box attacks and black-box attacks, and black-box methods generally suffer from low attack efficiency and poor concealment. This paper proposes a black-box attack method based on an improved Genetic Algorithm (GA). A Class Activation Mapping (CAM) heat-map interpretation method is introduced into the evolution of adversarial examples, and the original image is partitioned into pixel regions so that perturbation evolution is confined to the key regions of the image, which improves the concealment of the generated adversarial examples. An adaptive probability function and an elite retention strategy are used to improve attack efficiency. Through sample initialization, selection, crossover, and mutation, the method carries out a black-box attack with access only to the model's output labels and their confidence scores. Experimental results show that, compared with POBA-GA, a black-box attack method that is also based on a genetic algorithm, the proposed method generates adversarial examples with better concealment and requires fewer model queries at the same attack success rate: concealment improves by 7.14% on average, and the number of model queries falls by 6.43% on average.

Key words: adversarial example, Genetic Algorithm (GA), heat map, white-box attack, black-box attack
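
The pipeline summarized in the abstract (initialization, selection, crossover, mutation, elite retention, an adaptive probability, and perturbation confined to key regions) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitness function, the adaptive probability function, or how the heat map is turned into regions. Everything named here is a hypothetical stand-in, including `toy_model`, `ga_attack`, the fixed `mask` standing in for a thresholded CAM heat map, tournament selection, and the linear adaptive mutation rule.

```python
import random

def toy_model(x):
    """Hypothetical black box: exposes only (label, confidence)."""
    p1 = min(max(sum(x[:4]) / 4.0, 0.0), 1.0)  # class-1 score from 4 "key" pixels
    return (1, p1) if p1 >= 0.5 else (0, 1.0 - p1)

def ga_attack(image, true_label, mask, model, pop_size=20, generations=300,
              eps=0.5, sigma=0.1, alpha=0.1, p_min=0.05, p_max=0.5, seed=0):
    """Untargeted attack sketch; returns (adversarial image or None, query count)."""
    rng = random.Random(seed)
    n = len(image)
    queries = 0

    def perturbed(delta):
        # Add the perturbation and clip pixels back into [0, 1].
        return [min(max(p + d, 0.0), 1.0) for p, d in zip(image, delta)]

    def evaluate(delta):
        """One model query; returns (fitness, success)."""
        nonlocal queries
        queries += 1
        label, conf = model(perturbed(delta))
        success = label != true_label
        attack_score = 1.0 if success else 1.0 - conf
        size = sum(abs(d) for d in delta) / n  # assumed concealment penalty
        return attack_score - alpha * size, success

    # Initialization: random perturbation inside the mask, zero elsewhere.
    scored = []
    for _ in range(pop_size):
        d = [rng.uniform(-eps, eps) if i in mask else 0.0 for i in range(n)]
        scored.append((*evaluate(d), d))  # (fitness, success, delta)

    def tournament():
        a, b = rng.sample(scored, 2)
        return a if a[0] >= b[0] else b

    for gen in range(generations + 1):
        winners = [t for t in scored if t[1]]
        if winners:
            return perturbed(max(winners, key=lambda t: t[0])[2]), queries
        if gen == generations:
            break
        scored.sort(key=lambda t: t[0], reverse=True)
        best_fit, worst_fit = scored[0][0], scored[-1][0]
        span = (best_fit - worst_fit) or 1.0
        children = [scored[0]]  # elite retention: best survives without a new query
        while len(children) < pop_size:
            pa, pb = tournament(), tournament()
            # Assumed adaptive rule: fitter parents get a lower mutation rate.
            rel = (max(pa[0], pb[0]) - worst_fit) / span
            p_mut = p_max - (p_max - p_min) * rel
            child = []
            for i in range(n):
                if i not in mask:
                    child.append(0.0)  # pixels outside the key region never change
                    continue
                g = pa[2][i] if rng.random() < 0.5 else pb[2][i]  # uniform crossover
                if rng.random() < p_mut:
                    g = min(max(g + rng.gauss(0.0, sigma), -eps), eps)
                child.append(g)
            children.append((*evaluate(child), child))
        scored = children
    return None, queries

if __name__ == "__main__":
    image = [0.8] * 4 + [0.5] * 12  # 16-pixel toy "image", true label 1
    adv, used = ga_attack(image, 1, set(range(8)), toy_model)
    print(toy_model(adv) if adv else None, used)
```

In this sketch the elite carries its stored fitness into the next generation, so keeping it costs no extra query. That is one simple way elite retention could help lower the model access count that the abstract reports.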

CLC number: