Computer Engineering ›› 2025, Vol. 51 ›› Issue (7): 326-338. doi: 10.19678/j.issn.1000-3428.0069510

• Graphics and Image Processing •

Remote Sensing Small-Target Detection Method with Fusion of RGB and IR Images

LIU Chunxia1, MENG Jixing1, PAN Lihu1,*, GONG Dali2

  1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China
  2. Jingying Shuzhi Technology Co., Ltd., Taiyuan 030032, Shanxi, China
  • Received: 2024-03-07 Published: 2025-07-15 Online: 2024-06-25
  • Contact: PAN Lihu
  • Supported by:
    Basic Research Program of Shanxi Province (202203021221145); Shanxi Province Graduate Joint Training Demonstration Base Project (2022JD11)

Abstract:

Existing object detection methods produce false detections and missed detections on remote sensing images with complex backgrounds and little useful information. To address these problems, a multimodal remote sensing small-target detection method, BFMYOLO, is proposed. A pixel-level fusion module for Red-Green-Blue (RGB) and infrared (IR) images, the Bimodal Fusion Module (BFM), exploits the complementarity of the two modalities to fuse their information effectively. A full-scale adaptive updating module (AA) resolves multi-target information conflicts during feature fusion; by combining the CARAFE upsampling operator with additional shallow features, it strengthens fusion between non-adjacent layers and enriches the spatial information of small targets. An Improved task-decoupled Detection Head (IDHead) handles classification and regression separately to reduce interference between the two tasks, and fuses deep semantic features to further improve detection performance. The Normalized Wasserstein Distance (NWD) loss is adopted as the localization regression loss to reduce sensitivity to positional deviation. Experimental results show that the method reaches a mean Average Precision at an Intersection over Union (IoU) threshold of 0.5 (mAP@0.5) of 78.6%, 95.5%, and 73.3% on the VEDAI, NWPU VHR-10, and DIOR datasets, respectively, outperforming other advanced models and demonstrating good performance in remote sensing small-target detection.
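The motivation for an NWD-based regression loss can be illustrated with a small sketch. It uses the commonly published NWD formulation, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the distance between two boxes is mapped to a similarity through an exponential. The constant `c = 12.8`, the function names, and the example boxes are assumptions chosen for illustration; the abstract does not state how BFMYOLO sets the constant or weights the loss.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ax2 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay1, ay2 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by1, by2 = b[1] - b[3] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is a dataset-dependent normalizer; 12.8
    is an assumed value, not one taken from the paper.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


def nwd_loss(a, b, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(a, b, c)


if __name__ == "__main__":
    cases = {
        "small box, 1 px shift": ((10, 10, 6, 6), (11, 11, 6, 6)),
        "large box, 1 px shift": ((100, 100, 48, 48), (101, 101, 48, 48)),
        "small box, no overlap": ((10, 10, 6, 6), (17, 10, 6, 6)),
    }
    for name, (gt, pred) in cases.items():
        print(f"{name}: IoU={iou(gt, pred):.3f}  NWD={nwd(gt, pred):.3f}")
```

In this sketch the same one-pixel shift lowers IoU far more for the 6×6 box than for the 48×48 box, while NWD is identical in both cases and stays nonzero even when the boxes no longer overlap. That behavior is the usual argument for preferring NWD when targets are only a few pixels wide.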

Key words: remote sensing target detection, visible and infrared images, lightweight upsampling operators, attention mechanisms, feature fusion