Multi-Stage Motion Blur Image Restoration Network Based on Transformer

doi:10.19678/j.issn.1000-3428.0068246

Abstract:

Motion blur is a common cause of image degradation that limits image readability and the effectiveness of subsequent processing. To address the limited receptive field of convolutional networks and the information loss in conventional multi-stage networks, a multi-stage deblurring network based on the Transformer is proposed. The network adopts a multi-stage encoder-decoder structure with skip connections both within each stage and between stages to enhance information propagation. First, an efficient Transformer module uses channel attention and depthwise convolution to process the global and local information of the image. Second, a multi-branch feedforward network introduces multiple parallel branches to extract and fuse features at different scales and levels. Finally, multi-stage residual processing yields better restoration results. Experimental results show that the proposed network achieves a Peak Signal-to-Noise Ratio (PSNR) of 32.23 dB and a Structural Similarity Index Measure (SSIM) of 0.955 on the GoPro dataset, and a PSNR of 30.15 dB and an SSIM of 0.930 on the HIDE dataset, outperforming models such as DeepDeblur and DeblurGAN-V2.

Keywords: deep learning, Transformer model, attention mechanism, image restoration, multi-scale network
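
The abstract reports restoration quality as PSNR in dB. As a reminder of what that number measures, the following is a minimal sketch of the conventional PSNR definition, 10·log10(MAX² / MSE). It is not the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are passed as flat sequences of pixel intensities. This is an
    assumption made to keep the sketch free of external dependencies.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sharp = [100, 100, 100, 100]
    deblurred = [110, 90, 110, 90]  # hypothetical pixel values
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

Because the scale is logarithmic, a 2 dB PSNR difference between two methods corresponds to a mean squared error that is lower by a factor of roughly 1.58.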

ZHU Kai, LI Li, ZHANG Tong, JIANG Sheng, BIE Yiming. Multi-Stage Motion Blur Image Restoration Network Based on Transformer[J]. Computer Engineering, 2024, 50(9): 276-285.

https://www.ecice06.com/CN/Y2024/V50/I9/276

Figures/Tables: 8

Fig.1 Different multi-stage deblurring model structures

Fig.2 SE module structure

Fig.3 Single-branch and multi-branch structures

Fig.4 Multi-stage deblurring network structure

Fig.5 Comparison of deblurring results of different models

