
Computer Engineering, 2022, Vol. 48, Issue (2): 180-185, 193. doi: 10.19678/j.issn.1000-3428.0060739

• Graphics and Image Processing •

Face Swapping Detection Based on Multi-Channel Attention Mechanism

WU Qian, JIA Shijie

  1. School of Electrical Information Engineering, Dalian Jiaotong University, Dalian, Liaoning 116028, China
  • Received: 2021-01-29 Revised: 2021-03-01 Published: 2021-03-02
  • About the authors: WU Qian (born 1994), female, master's degree; main research interests: deep learning and face swapping detection. JIA Shijie, professor, Ph.D.
  • Funding:
    Scientific Research Project of the Education Department of Liaoning Province (JDL2019006).

Abstract: Face swapping technology based on deep learning has developed rapidly, but face-swapped images generated automatically by DeepFake may endanger people's privacy and security. To detect DeepFake images, a deep learning detection network based on a multi-channel attention mechanism is proposed. The model uses the Xception network as the base feature extractor. In the multi-channel attention module, global and local attention representations are fused by matrix multiplication to reduce the loss of important information. A center loss is added to the loss function to make the learned features more discriminative. During training, the attention maps guide the cropping and erasing of training images, which serves as data augmentation. Experimental results show that, compared with Xception and B4Att, the model improves DeepFake detection accuracy by 0.77 and 0.45 percentage points, respectively, on the FaceForensics++ dataset, and by 5.30 and 4.68 percentage points, respectively, on the Celeb-DF dataset.

Key words: face swapping, multi-channel attention mechanism, image detection, Xception network, deep learning
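
The center loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used form of center loss (half the squared distance between each feature vector and the center of its class, averaged over the batch), which is normally added to a classification loss through a weighting factor. The abstract gives neither that weight nor the center update rule, so the function names `center_loss` and `update_centers`, the step size `alpha`, and the toy 2-D features below are all illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Sequence


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, List[float]]) -> float:
    """Batch mean of 0.5 * squared distance from each feature to its class center."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if not features:
        raise ValueError("batch must not be empty")
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / len(features)


def update_centers(features: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   centers: Dict[int, List[float]],
                   alpha: float = 0.5) -> Dict[int, List[float]]:
    """Simplified update (an assumption): move each class center a fraction
    `alpha` of the way toward the mean of that class's features in the batch."""
    new_centers = {k: list(v) for k, v in centers.items()}
    for k, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == k]
        if not members:
            continue  # no sample of this class in the batch, keep the center
        mean = [sum(m[d] for m in members) / len(members) for d in range(len(c))]
        new_centers[k] = [ci + alpha * (mi - ci) for ci, mi in zip(c, mean)]
    return new_centers


if __name__ == "__main__":
    # Toy batch: label 0 = real, label 1 = fake (hypothetical 2-D features).
    feats = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
    labs = [0, 0, 1]
    ctrs = {0: [0.0, 0.0], 1: [3.0, 3.0]}
    print("loss before update:", center_loss(feats, labs, ctrs))
    ctrs = update_centers(feats, labs, ctrs)
    print("loss after update: ", center_loss(feats, labs, ctrs))
```

Pulling each feature toward its class center tightens the real and fake clusters, which is the sense in which the abstract says the loss improves feature discrimination.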

CLC number: