Ship Detection Method Based on Multimodal Visible and Infrared Image Fusion

doi:10.19678/j.issn.1000-3428.0070436

Abstract

In all-weather ship detection, single-modal images are easily affected by illumination, weather, and other environmental conditions, which leads to low detection accuracy and a high missed-detection rate. To address this, the paper proposes VIF-RTDETR, a ship detection method that fuses visible-light and infrared image information. Building on the rich detail and color information of visible images and the stable performance of infrared images in low-light environments, the method constructs a four-channel input model. A visible-infrared fusion module, VIF, is designed to fuse the complementary information of the two modalities so that the detection network uses both more effectively. Channel attention is incorporated into the backbone feature extraction network to dynamically assign a weight to each channel, which strengthens channel feature expression and further improves feature extraction. In addition, to improve the detection of small ship targets, a weighted bounding-box loss function is designed so that the model attends to the feature expression of targets of different sizes and improves detection accuracy across target sizes. Experimental results on visible and infrared ship datasets show that the model reaches an AP_0.5:0.95 of 78.3% and an AP_0.5 of 98.5%; its AP_0.5:0.95 is 4.7 and 9.2 percentage points higher than that of the single-modal visible and infrared models, respectively. The recall AR_0.5:0.95 reaches 85.2%, which is 3.1 and 7.3 percentage points higher than that of the single-modal visible and infrared models, respectively. The method thus significantly improves ship detection accuracy and reduces missed detections.

Key words: ship detection, visible and infrared images, all-weather detection, multimodal fusion, attention mechanism
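The abstract names a four-channel input and backbone channel attention but gives no implementation details here. The sketch below is only an illustration under stated assumptions: the four channels are taken to be the three visible-light planes plus one registered infrared plane, and the channel attention is taken to follow the common squeeze-and-excitation pattern (global mean, two-layer gate, per-channel scaling). The function names `stack_four_channel` and `channel_attention`, and the weight matrices `W_REDUCE` and `W_EXPAND`, are hypothetical and are not taken from the paper.

```python
import math


def stack_four_channel(rgb, ir):
    """Concatenate 3 visible-light planes and 1 infrared plane into 4 planes.

    rgb: list of 3 planes, each an H x W list of floats.
    ir:  one H x W plane, assumed already registered to the visible image.
    """
    if len(rgb) != 3:
        raise ValueError("expected 3 visible-light planes")
    h, w = len(ir), len(ir[0])
    for plane in rgb:
        if len(plane) != h or any(len(row) != w for row in plane):
            raise ValueError("visible and infrared planes must share one size")
    return list(rgb) + [ir]


def channel_attention(planes, w_reduce, w_expand):
    """SE-style reweighting (an assumption, not the paper's confirmed design)."""
    # Squeeze: reduce each channel to its global mean.
    squeezed = [sum(map(sum, p)) / (len(p) * len(p[0])) for p in planes]
    # Excite, step 1: bottleneck projection followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excite, step 2: expand back to one sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every pixel of a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in p]
              for p, g in zip(planes, gates)]
    return scaled, gates


# Toy 2 x 2 inputs; the values and weights are made up for illustration.
rgb = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.5, 0.5], [0.5, 0.5]],
]
ir = [[0.9, 0.9], [0.9, 0.9]]

W_REDUCE = [[0.5, 0.5, 0.5, 0.5], [-0.5, 0.5, -0.5, 0.5]]       # 4 -> 2
W_EXPAND = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]    # 2 -> 4

x = stack_four_channel(rgb, ir)
y, gates = channel_attention(x, W_REDUCE, W_EXPAND)
```

In a trained network the two weight matrices would be learned, so the gates would depend on the image content; here they are fixed only to keep the sketch self-contained. The sketch does not cover the VIF fusion module or the weighted bounding-box loss, whose designs are not specified in the abstract.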

YU Mengyuan, LIU Xiangyang. Ship Detection Method Based on Multimodal Visible and Infrared Image Fusion[J]. Computer Engineering, 2026, 52(6): 278-287.

https://www.ecice06.com/CN/Y2026/V52/I6/278

Figures/Tables (11)

Fig.1 RT-DETR network structure

Fig.2 Multimodal detection network structure

Fig.3 Visible and infrared fusion module

Fig.4 Backbone feature extraction structure

Fig.5 Examples of ship images in the dataset

Fig.6 Comparison of training AP values of different models fusing visible and infrared images

Fig.7 Selected detection results in different scenarios

