
计算机工程 ›› 2023, Vol. 49 ›› Issue (8): 207-214. doi: 10.19678/j.issn.1000-3428.0065255

• Graphics and Image Processing •

  • About the authors:

    Kezheng WANG (born 1984), male, senior engineer, M.S.; main research interests: big data and information engineering

    Yufeng XU, M.S. candidate

    Shangbo ZHOU, professor, Ph.D.

  • Funding:
    National Natural Science Foundation of China (61762025)

Image Dehazing Model Combined with Contrastive Perceptual Loss and Fusion Attention

Kezheng WANG1, Yufeng XU2, Shangbo ZHOU2   

  1. Division of Infrastructure and Computing Services, Lenovo Group, Chengdu 610199, China
    2. School of Computer Science, Chongqing University, Chongqing 400044, China
  • Received:2022-07-15 Online:2023-08-15 Published:2023-08-15


Abstract:

Existing deep learning-based image dehazing methods mainly expand the depth or width of a network to improve its performance, which requires more computational resources. In addition, current dehazing models adopt only haze-free images as positive samples to guide network training, leaving the negative samples, that is, hazy images, unexploited. It is therefore desirable to further exploit the feature differences between hazy and haze-free image pairs and to handle regional features of different scales, locations, ranges, and angles more flexibly, assigning greater weight to important features. To this end, this study proposes CFFormer, an improved DehazeFormer-T model that incorporates a Contrastive Perceptual Loss (CPL) and a Fusion Attention (FA) mechanism. The L1 loss function measures the reconstruction loss between the ground-truth and predicted images. The CPL function, computed on the fixed pre-trained weights of VGG16, strengthens the network's contrastive learning ability: it treats the real haze-free image and the hazy image as the positive and negative sample, respectively, pulling the predicted image closer to the clear image while pushing it away from the hazy image. The FA module fuses three types of attention: scale, spatial, and channel attention. These mechanisms are applied separately to different dimensions of the feature maps so that the network focuses on more important information. Experimental results on the ITS dataset of RESIDE show that CFFormer improves the PSNR and SSIM metrics over DehazeFormer-T by 9.4% and 0.6%, respectively, confirming the effectiveness of the proposed model.
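The contrastive loss described above can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: the paper extracts features with a fixed pre-trained VGG16, but to keep the example self-contained a small frozen convolutional stack stands in as a hypothetical feature extractor, and the ratio form of the loss is an assumption.

```python
import torch
import torch.nn as nn

class ContrastivePerceptualLoss(nn.Module):
    """Sketch of a contrastive perceptual loss as described in the abstract.

    Assumption: a tiny frozen conv stack replaces the fixed pre-trained
    VGG16 features used in the paper, so the example runs standalone.
    """
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
        )
        for p in self.features.parameters():
            p.requires_grad_(False)  # fixed weights, as with pre-trained VGG16
        self.l1 = nn.L1Loss()

    def forward(self, pred, clear, hazy):
        fp = self.features(pred)   # anchor: predicted (dehazed) image
        fc = self.features(clear)  # positive: ground-truth haze-free image
        fh = self.features(hazy)   # negative: hazy input image
        # Pull the prediction toward the positive features while pushing it
        # away from the negative features (ratio form; small eps for safety).
        return self.l1(fp, fc) / (self.l1(fp, fh) + 1e-7)
```

Minimizing this ratio drives the numerator (distance to the clear image) down while keeping the denominator (distance to the hazy image) large, which is the pull/push behavior the abstract describes.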

Key words: image dehazing, deep learning, DehazeFormer model, CFFormer model, Contrastive Perceptual Loss(CPL), Fusion Attention(FA)
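The fusion attention idea, applying attention separately along different dimensions of the feature maps, can be sketched as follows. This is a hypothetical illustration combining channel and spatial attention in sequence; the paper additionally fuses scale attention, whose details are not given in the abstract, so this module is an assumption rather than the authors' FA design.

```python
import torch
import torch.nn as nn

class FusionAttention(nn.Module):
    """Hypothetical sketch: channel attention followed by spatial attention.

    Assumption: the paper's FA also includes scale attention, omitted here
    because its structure is not specified in the abstract.
    """
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel attention: squeeze spatial dims, then excite per channel.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention: one weight map over H x W from pooled statistics.
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel(x)  # reweight channels (which features matter)
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(pooled)  # reweight positions (where to look)
```

Because each stage multiplies the features by weights in (0, 1), the module emphasizes informative channels and spatial regions without changing the feature-map shape, so it can be dropped into an encoder-decoder dehazing backbone.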