Image Inpainting Model Based on Edge and Attention Transfer Across Layers

doi:10.19678/j.issn.1000-3428.0064758

Abstract

Existing deep learning-based image inpainting algorithms produce disconnected and blurred local structures when processing images with large irregular missing regions. To address this problem, a two-stage generative image inpainting model based on edges and attention transfer across layers is proposed. The model consists of an edge repair network and an image repair network. The edge repair network builds on an autoencoder and uses dilated convolution to repair the binary edge map of the damaged image. The repaired edge map is then fed into the image repair network, together with the damaged image, as a prior. In the image repair network, an Attention Transfer Network Across Layer (ATNAL) reconstructs the encoded features at each scale from deep to shallow. The reconstructed feature maps are passed through skip connections to the decoder layers, where they are fused with the corresponding latent features for decoding. This improves the contextual consistency of the output of each decoder layer and reduces the loss of structural information and semantic features, yielding the final repaired image. Experimental results on the CelebA, Facade, and Places2 datasets show that, compared with current mainstream algorithms, the proposed method reduces the average L1 loss by 1.044% to 3.801% and increases the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) by 1.435 to 4.486 dB and 1.789% to 8.755%, respectively. The method not only generates content with reasonable overall semantics but also produces local structural connectivity and texture synthesis that are more consistent with human visual perception.

Key words: image inpainting, edge inpainting, dilated convolution, Attention Transfer Network Across Layer (ATNAL), skip connection
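To make the reported metrics concrete, the following is a minimal sketch of how a mean L1 error and PSNR are commonly computed between a reference image and a restored image. It is not the authors' evaluation code: the function names `mean_l1` and `psnr`, the flattened-list image representation, the 8-bit peak value of 255, and the toy pixel values are all assumptions made for illustration. SSIM is omitted because it requires windowed local statistics.

```python
import math


def mean_l1(reference, restored, peak=255.0):
    """Mean absolute pixel error, expressed as a fraction of the peak value."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    total = sum(abs(a - b) for a, b in zip(reference, restored))
    return total / (len(reference) * peak)


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical flattened 2x2 grayscale images (values chosen for illustration).
reference = [0, 64, 128, 255]
restored = [0, 60, 130, 250]

l1_value = mean_l1(reference, restored)
psnr_value = psnr(reference, restored)
print(f"mean L1: {l1_value:.4%}, PSNR: {psnr_value:.2f} dB")
```

A lower mean L1 and a higher PSNR both indicate that the restored pixels are closer to the reference, which is the direction of improvement the abstract reports.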

CLC number: TP391

FAN Yao, SHI Yingnan, BAI Jinxian. Image Inpainting Model Based on Edge and Attention Transfer Across Layers[J]. Computer Engineering, 2023, 49(6): 180-192.

https://www.ecice06.com/CN/Y2023/V49/I6/180

Figures and tables: 17

