
Computer Engineering ›› 2024, Vol. 50 ›› Issue (10): 100-109. doi: 10.19678/j.issn.1000-3428.0068106

• Artificial Intelligence and Pattern Recognition •


Deepfake Cross-Model Defense Method Based on Generative Adversarial Network

DAI Lei, CAO Lin, GUO Yanan, ZHANG Fan*, DU Kangning

  1. School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China
  • Received: 2023-07-19; Online: 2024-10-15; Published: 2024-03-06
  • Contact: ZHANG Fan
  • Supported by: National Natural Science Foundation of China (U20A20163, 62001033, 62201066); Scientific Research Program of Beijing Municipal Education Commission (KZ202111232049, KM202111232014)


Abstract:

To reduce the social risks caused by the abuse of deepfake technology, an active defense method against deepfakes based on a Generative Adversarial Network (GAN) is proposed. Adversarial samples are created by adding an imperceptible perturbation to the original image, which significantly distorts the outputs of multiple forgery models. The proposed model comprises an adversarial sample generation module and an adversarial sample optimization module. The generation module consists of a generator and a discriminator: after the generator receives an original image and produces a perturbation, adversarial training constrains the spatial distribution of the perturbation, reducing its visual perceptibility and improving the realism of the adversarial sample. The optimization module consists of a basic adversarial watermark, several deepfake models, and discriminators; it simulates black-box attacks on multiple deepfake models to improve the attack capability and transferability of the adversarial samples. Training and testing are conducted on the commonly used deepfake datasets CelebFaces Attributes (CelebA) and Labeled Faces in the Wild (LFW). Experimental results show that, compared with existing active defense methods, the proposed method achieves cross-model active defense with a defense success rate exceeding 85%, and generates adversarial samples 20-30 times faster than conventional algorithms.
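The abstract describes the mechanism only at a high level. As a rough illustration of the general technique (not the authors' implementation), the PyTorch sketch below shows a small generator mapping a face image to an ε-bounded perturbation, trained so that several frozen surrogate forgery models produce distorted outputs on the protected image while a fidelity term keeps it close to the original. The network sizes, loss weights, and toy surrogate models are all assumptions made for this example; in the paper, a discriminator enforces realism, for which the simple fidelity term stands in here.

```python
# Illustrative sketch only -- NOT the paper's implementation.
import torch
import torch.nn as nn

class PerturbationGenerator(nn.Module):
    """Maps an image to a perturbation whose pixels stay in [-eps, eps]."""
    def __init__(self, eps: float = 0.05):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.eps * torch.tanh(self.net(x))  # tanh bounds the perturbation

# Toy stand-ins for the deepfake models under attack; the paper uses real
# face-forgery networks here, treated as simulated black boxes.
surrogates = [nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
              for _ in range(3)]
for f in surrogates:
    f.requires_grad_(False)  # forgery models are frozen; only the generator learns

gen = PerturbationGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)

x = torch.rand(4, 3, 64, 64)           # stand-in batch of face images in [0, 1]
x_adv = (x + gen(x)).clamp(0.0, 1.0)   # adversarial sample remains a valid image

# Attack term: maximize the distortion each surrogate produces on the protected
# image relative to the clean image. Fidelity term: keep x_adv close to x.
distortion = sum(((f(x_adv) - f(x)) ** 2).mean() for f in surrogates)
fidelity = ((x_adv - x) ** 2).mean()
loss = -distortion + 10.0 * fidelity   # the 10.0 weight is an arbitrary choice

opt.zero_grad()
loss.backward()
opt.step()
```

Once such a generator is trained, protecting a new image takes a single forward pass rather than a per-image iterative optimization, which is consistent with the abstract's claim of a 20-30 times speedup in adversarial sample generation over conventional algorithms.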

Key words: deepfake, adversarial samples, active defense, Generative Adversarial Network (GAN), transferability