
Computer Engineering, 2024, Vol. 50, Issue (8): 270-281. doi: 10.19678/j.issn.1000-3428.0068186

• Graphics and Image Processing •

Target Detection Under Low Light Conditions Based on Visible and Infrared Images

Yuting WANG*, Zhiming LIU, Yaping WAN, Tao ZHU

  1. School of Computer, University of South China, Hengyang 421200, Hunan, China
  • Received: 2023-08-04  Online: 2024-08-15  Published: 2024-08-25
  • Corresponding author: Yuting WANG
  • Supported by:
    National Natural Science Foundation of China (62071213)

Abstract:

Image fusion combines multiple input images into a single image. Visible-infrared image fusion can improve target detection accuracy, but it often performs poorly in low-light scenes. To address this, this study proposes a new fusion model, DAPR-Net. The model uses an encoder-decoder structure with cross-layer residual connections that link each encoder output to the input of the corresponding decoder layer, strengthening information flow between the convolutional layers. Within the encoder, a dual-attention feature extraction module (AFEM) is designed so that the network can better distinguish the differences between the fused image and the input visible and infrared images while retaining the key information of both. DAPR-Net is compared with six state-of-the-art methods on several public datasets. The experimental results show that, relative to the baseline PIAFusion model, its information entropy, spatial frequency, average gradient, standard deviation, and visual fidelity on the LLVIP and MSRS datasets increase by 0.849, 3.252, 7.634, 10.38, and 0.293, and by 2.105, 2.23, 4.099, 27.938, and 0.343, respectively. With the YOLOv5 target detection network, the mean average precision, recall, precision, and F1 score on LLVIP and MSRS increase by 8.8, 1.4, 1.9, and 1.5 percentage points and by 7.5, 1.4, 8.8, and 1.2 percentage points, respectively, a clearer advantage than the other fusion methods.
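To make the architectural idea concrete, the following is a minimal PyTorch sketch of an encoder-decoder of the kind the abstract describes, with cross-layer residual connections from each encoder output to the corresponding decoder input. The abstract does not specify the internals of AFEM, the channel widths, or the layer counts, so the dual-attention block (channel attention followed by spatial attention), the widths, and all module names here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Illustrative dual-attention block (hypothetical stand-in for AFEM):
    channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, re-weight each channel.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: re-weight each location from pooled channel maps.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial_gate(pooled)

class FusionEncoderDecoder(nn.Module):
    """Encoder-decoder with cross-layer residual connections: each encoder
    output is added to the input of the corresponding decoder layer."""
    def __init__(self, in_ch=2, widths=(16, 32, 64)):
        super().__init__()
        self.encoders = nn.ModuleList()
        ch = in_ch
        for w in widths:
            self.encoders.append(nn.Sequential(
                nn.Conv2d(ch, w, 3, padding=1), nn.ReLU(inplace=True),
                DualAttention(w)))
            ch = w
        self.decoders = nn.ModuleList()
        for w in reversed(widths[:-1]):
            self.decoders.append(nn.Sequential(
                nn.Conv2d(ch, w, 3, padding=1), nn.ReLU(inplace=True)))
            ch = w
        self.head = nn.Conv2d(ch, 1, 3, padding=1)  # fused single-band image

    def forward(self, visible, infrared):
        x = torch.cat([visible, infrared], dim=1)  # stack the two modalities
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
        skips.pop()  # the deepest feature map feeds the decoder directly
        for dec in self.decoders:
            x = dec(x)
            x = x + skips.pop()  # cross-layer residual connection
        return torch.sigmoid(self.head(x))

# Usage: single-channel visible and infrared inputs of equal size yield a
# fused image at the same resolution.
vis, ir = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
fused = FusionEncoderDecoder()(vis, ir)  # shape (1, 1, 256, 256)
```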

Key words: low-light, visible, infrared image, image fusion, target detection
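The quality gains above are reported on standard image-fusion metrics. As a reference for how several of them are commonly computed (exact formulations vary slightly across papers, and standard deviation and visual fidelity are omitted here for brevity), a small NumPy sketch on a stand-in image:

```python
import numpy as np

def entropy(img, bins=256):
    """Information entropy (EN) of an 8-bit grayscale image, in bits."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def spatial_frequency(img):
    """Spatial frequency (SF): RMS of row and column intensity differences."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def average_gradient(img):
    """Average gradient (AG): mean local gradient magnitude."""
    img = img.astype(np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]  # horizontal differences
    gy = img[1:, :-1] - img[:-1, :-1]  # vertical differences
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2))

fused = (np.random.rand(256, 256) * 255).astype(np.uint8)  # stand-in image
print(entropy(fused), spatial_frequency(fused), average_gradient(fused))
```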