Improved YOLOv8-based Algorithm for Instance Segmentation in Traffic Scenes

doi:10.19678/j.issn.1000-3428.0068677

Abstract

An instance segmentation algorithm, DE-YOLO, based on an improved YOLOv8 is proposed. To reduce interference from complex image backgrounds, an efficient multi-scale attention mechanism is introduced, whose cross-dimensional interaction distributes spatial semantic features evenly within each feature group. In the backbone network, the deformable convolution DCNv2 is combined with the C2f convolutional layer to overcome the limitations of standard convolution and increase flexibility. To reduce harmful gradients and improve detector accuracy, the Wise-Intersection-over-Union (WIoU) loss, which uses a dynamic non-monotonic focusing mechanism, replaces the Complete Intersection-over-Union (CIoU) loss for quality evaluation; this optimizes bounding-box localization and improves segmentation accuracy. In addition, Mixup data augmentation is enabled to enrich the dataset and the training features, improving the model's learning ability. Experimental results on the Cityscapes urban street-scene dataset show that DE-YOLO improves the mask mean Average Precision (mAP_mask) by 2.0 percentage points and the mask average precision at an IoU threshold of 0.5 (mAP_mask@0.5) by 3.2 percentage points over the baseline model YOLOv8n-seg. The proposed algorithm improves accuracy while maintaining a high detection speed and a small parameter count, with 2.2 to 31.3 percentage points fewer parameters than comparable models.

Keywords: YOLOv8 network, instance segmentation, efficient multi-scale attention, deformable convolution, loss function
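The abstract names WIoU with a dynamic non-monotonic focusing mechanism as the replacement for the CIoU loss, but gives no formula. The sketch below is one possible reading, following the WIoU v1 and v3 definitions in the Wise-IoU preprint (reference 22) rather than the DE-YOLO implementation, which this page does not show. The function names, the `(x1, y1, x2, y2)` box layout, the `mean_iou_loss` argument (a stand-in for the running mean of the IoU loss kept during training), and the defaults `alpha=1.9` and `delta=3.0` are all assumptions made for illustration. Boxes are assumed to have positive width and height.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target):
    """WIoU v1: the IoU loss scaled by a centre-distance attention term."""
    # Offset between the centres of the predicted and target boxes.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    # In an autograd framework this denominator would be detached
    # from the gradient; plain Python has no gradients to detach.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))
    return r_wiou * (1.0 - box_iou(pred, target))


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """WIoU v3: the v1 loss times a non-monotonic focusing coefficient."""
    # Outlier degree: this box's IoU loss relative to the running mean.
    beta = (1.0 - box_iou(pred, target)) / mean_iou_loss
    # The coefficient rises for small beta and falls for large beta,
    # so very poor (outlier) boxes receive a reduced gradient gain.
    r = beta / (delta * alpha ** (beta - delta))
    return r * wiou_v1(pred, target)


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print("IoU:", box_iou(pred, target))
    print("WIoU v1 loss:", wiou_v1(pred, target))
    print("WIoU v3 loss:", wiou_v3(pred, target, mean_iou_loss=0.5))
```

When the outlier degree equals `delta`, the focusing coefficient is 1 and the v3 loss reduces to the v1 loss, which is a convenient sanity check for any reimplementation.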

ZHAO Nannan, GAO Feichen. Improved YOLOv8-based Algorithm for Instance Segmentation in Traffic Scenes[J]. Computer Engineering, 2025, 51(1): 198-207.

https://www.ecice06.com/CN/Y2025/V51/I1/198

Figures/Tables: 12

Fig.1 Structure of YOLOv8n-seg network

Fig.2 Overall architecture of DE-YOLOv8n-seg network

Fig.3 Efficient multi-scale attention mechanism

Fig.4 C2f module with deformable convolution

Fig.5 Comparison of visual results of instance segmentation before and after improvement

Fig.6 Comparison of visual results of instance segmentation on Cityscapes dataset
