
Computer Engineering ›› 2023, Vol. 49 ›› Issue (6): 234-241. doi: 10.19678/j.issn.1000-3428.0064830

• Graphics and Image Processing •

Face Super-Resolution Reconstruction Based on Attention Residual Network

WANG Tongguan, LAI Huicheng, CAI Yuxi, GAO Guxue, WANG Liejun

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  • Received: 2022-05-26 Revised: 2022-07-23 Published: 2022-09-21
  • About the authors: WANG Tongguan (born 1999), male, master's student, main research interest: face super-resolution reconstruction; LAI Huicheng (corresponding author), professor and doctoral supervisor; CAI Yuxi, master's student; GAO Guxue, doctoral student; WANG Liejun, professor, Ph.D., doctoral supervisor.
  • Funding:
    National Natural Science Foundation of China (U1903213).

Abstract: Insufficient interaction among features within channels, together with inadequate feature utilization and representation, leads to poor recovery of facial detail. To address this problem, this paper proposes an encoder-decoder-based attention residual network. In particular, it designs a Residual Attention Block (RAB), which mainly consists of a baseline residual block, an hourglass block, and an internal-feature split attention block. The internal-feature split attention block strengthens the interaction within channels, enabling the network to extract more detailed feature information and recover more facial details. In addition, a pre-activation block is used in the residual block to address the artifacts associated with Batch Normalization (BN) layers in super-resolution networks. Finally, a multi-stage feature fusion block at the end of the feature extraction unit fuses the features obtained at different stages, which alleviates feature loss as features propagate through the network and improves feature utilization. Experimental results show that the proposed method recovers more facial details. On the Helen face dataset, the Peak Signal-to-Noise Ratio (PSNR) of the reconstructed face images is 27.74 dB, which is 1.47 dB and 1.12 dB higher than that of the SISN and DICNet methods, respectively. On the CelebA face dataset, the PSNR of the reconstructed face images is 27.40 dB, which is 1.26 dB and 0.39 dB higher than that of SISN and DICNet, respectively.

Key words: face super-resolution, attention mechanism, residual network, feature fusion, encoder, decoder
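
The PSNR figures quoted in the abstract can be read more easily with the metric's definition in hand: PSNR = 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed images. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names `psnr` and `implied_baseline` are hypothetical, and the 8-bit peak value of 255 and the plain per-pixel comparison are assumptions, since the abstract does not state the evaluation protocol (for example, which color channel is used). The baseline values it derives are only what the abstract's rounded numbers imply by subtraction, not figures reported on this page.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB for two equally sized images given as lists of pixel rows.

    Assumption: pixel values lie in [0, max_value]; 255 corresponds to 8-bit images.
    """
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("images must be non-empty and have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def implied_baseline(proposed_db, gain_db):
    """Baseline PSNR implied by a reported score and a reported improvement."""
    return round(proposed_db - gain_db, 2)


if __name__ == "__main__":
    # Scores and gains exactly as quoted in the abstract.
    reported = {
        "Helen": (27.74, {"SISN": 1.47, "DICNet": 1.12}),
        "CelebA": (27.40, {"SISN": 1.26, "DICNet": 0.39}),
    }
    for dataset, (score, gains) in reported.items():
        for method, gain in gains.items():
            print(dataset, method, implied_baseline(score, gain), "dB (implied)")
```

Because PSNR is logarithmic, a gain of about 1 dB corresponds to roughly a 20% reduction in mean squared error, which gives a sense of scale for the improvements listed above.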

CLC number: