Combining Improved YOLOv5s and Dynamic Data Augmentation for Sea Surface Ship Detection

doi:10.19678/j.issn.1000-3428.0069459

Abstract:

Sea surface imaging is easily affected by weather, illumination, and water mist. To address the resulting challenges in sea surface ship detection, namely blurred small objects, large differences in object scale, and class imbalance, this paper designs a dynamic "copy-paste" data augmentation method, embeds it into the YOLOv5 framework, and proposes an improved YOLOv5s algorithm for sea surface object detection. In the backbone network, a shallow local perception module is designed that places a hybrid dilated convolution, a depthwise separable convolution, and a residual connection branch in parallel to enlarge the receptive field of the module and strengthen the extraction of local detail. In the neck network, an attention fusion module uses spatial and channel attention mechanisms to aggregate shallow spatial information and deep semantic information, which improves the feature representation capability of the network. At the detection output, a hierarchical fusion decoupling head is designed by downsampling the features of the adjacent shallower detection head and fusing them, which improves object classification and localization accuracy. The dynamic "copy-paste" data augmentation strategy crops objects from the training set images and stores them in a target sample library. In each training epoch, targets are randomly selected from this library according to the probability of the target distribution, subjected to geometric and photometric transformations in certain proportions, and pasted at random positions in the training images, which increases the foreground target density. Experimental results on the SMD-Plus dataset show that the mAP@0.5 and mAP@0.5:0.95 of the proposed algorithm are 6.7 and 5.2 percentage points higher, respectively, than those of the YOLOv5s model. In transfer experiments on the WSODD dataset, the mAP@0.5 and mAP@0.5:0.95 of the proposed algorithm are 3.7 and 3.3 percentage points higher, respectively, than those of the YOLOv5s model.
The improved algorithm and the proposed dynamic data augmentation method effectively alleviate class and size imbalance, improve the detection accuracy for small targets, and are suitable for ship detection in sea surface scenes.

Key words: ship detection, data augmentation, multi-scale features, small object detection, attention mechanism
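
The dynamic copy-paste strategy summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state how the selection probabilities are computed, so the sketch assumes inverse-frequency class weights. The function names `build_bank`, `class_weights`, and `sample_pastes`, the scale and brightness ranges, and the `(x1, y1, x2, y2)` box format are all hypothetical and are not taken from the paper. Pixel operations are omitted; only the sampling plan for one image is produced.

```python
import random


def build_bank(annotations):
    """Group cropped targets by class; each annotation is (class_name, (x1, y1, x2, y2))."""
    bank = {}
    for cls, box in annotations:
        bank.setdefault(cls, []).append(box)
    return bank


def class_weights(bank):
    """Assumed rule: sampling weight inversely proportional to class frequency."""
    inv = {cls: 1.0 / len(boxes) for cls, boxes in bank.items()}
    total = sum(inv.values())
    return {cls: w / total for cls, w in inv.items()}


def sample_pastes(bank, img_w, img_h, n_paste, rng):
    """Plan n_paste pastes: pick a target, a random transform, and a random position."""
    weights = class_weights(bank)
    classes = sorted(weights)
    probs = [weights[c] for c in classes]
    pastes = []
    for _ in range(n_paste):
        cls = rng.choices(classes, weights=probs, k=1)[0]
        x1, y1, x2, y2 = rng.choice(bank[cls])
        scale = rng.uniform(0.5, 1.5)   # geometric: random rescale (assumed range)
        flip = rng.random() < 0.5       # geometric: horizontal flip
        gain = rng.uniform(0.8, 1.2)    # photometric: brightness gain (assumed range)
        # Clamp the transformed size so the pasted box always fits in the image.
        w = max(1, min(img_w, round((x2 - x1) * scale)))
        h = max(1, min(img_h, round((y2 - y1) * scale)))
        px = rng.randint(0, img_w - w)  # random top-left corner inside the image
        py = rng.randint(0, img_h - h)
        pastes.append({
            "cls": cls,
            "src": (x1, y1, x2, y2),
            "box": (px, py, px + w, py + h),
            "flip": flip,
            "gain": gain,
        })
    return pastes
```

With a bank of eight targets of one class and two of another, the assumed weighting selects the rarer class about four times as often, which is one way the class imbalance mentioned in the abstract could be reduced. Because the plan is resampled on every call, each training epoch sees different pasted targets.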

MA Gan, GU Yu, PENG Dongliang. Combining Improved YOLOv5s and Dynamic Data Augmentation for Sea Surface Ship Detection[J]. Computer Engineering, 2025, 51(9): 294-305.

https://www.ecice06.com/CN/Y2025/V51/I9/294

Figures and tables (17)

Fig.1 Improved YOLOv5s model

Fig.2 Structure of shallow local perception module

Fig.3 Structure of depthwise separable convolution

Fig.4 Structure of attention fusion module

Fig.5 Structure of hierarchical fusion decoupling head

Fig.6 SMD-Plus dataset images

Fig.7 Schematic diagram of crop and save

Fig.8 Schematic diagram of dynamic transformation

Fig.9 Schematic diagram of dynamic paste

Fig.10 Detection images of SMD-Plus dataset

Fig.11 Detection images of WSODD dataset
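
The shallow local perception module (Fig.2) combines dilated and depthwise separable convolutions (Fig.3). The arithmetic behind these two building blocks can be sketched as follows. The formulas are the standard ones for these layer types. The 3x3 kernel, the dilation rates 1 to 3, and the 64-channel width are assumptions chosen for illustration and are not taken from the paper.

```python
def effective_kernel(k, dilation):
    """Side length of the area covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(k, c_in, c_out):
    """Weight count of a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


# Dilation widens the area a 3x3 kernel covers without adding weights.
for d in (1, 2, 3):
    print(f"dilation {d}: 3x3 kernel spans {effective_kernel(3, d)} pixels per side")

# Assumed layer width of 64 input and 64 output channels.
standard = conv_params(3, 64, 64)
separable = dw_separable_params(3, 64, 64)
print(f"standard: {standard} weights, depthwise separable: {separable} weights")
```

Under these assumptions a dilated branch enlarges the receptive field at no extra weight cost, while the depthwise separable branch needs roughly one eighth of the weights of a standard convolution of the same size.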

