
Computer Engineering ›› 2025, Vol. 51 ›› Issue (1): 198-207. doi: 10.19678/j.issn.1000-3428.0068677

• Graphics and Image Processing •

Improved YOLOv8-based Algorithm for Instance Segmentation in Traffic Scenes

ZHAO Nannan*, GAO Feichen

  1. School of Mechanical and Electrical Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi, China
  • Received: 2023-10-24 Online: 2025-01-15 Published: 2025-02-11
  • Contact: ZHAO Nannan
  • Supported by:
    Natural Science Basic Research Program of Shaanxi Province (2019JM-443)

Abstract:

This paper proposes DE-YOLO, an instance segmentation algorithm based on an improved YOLOv8. To reduce interference from complex image backgrounds, an efficient multi-scale attention mechanism is introduced, whose cross-dimensional interaction distributes spatial semantic features evenly within each feature group. In the backbone network, the deformable convolution DCNv2 is combined with the C2f convolutional layer to overcome the limitations of standard convolution and increase its flexibility. To reduce harmful gradients and improve detector accuracy, the Wise-Intersection-over-Union (WIoU) loss, which uses a dynamic non-monotonic focusing mechanism, replaces the Complete Intersection-over-Union (CIoU) loss for assessing box quality; this refines bounding box localization and improves segmentation accuracy. In addition, Mixup data augmentation is enabled to enrich the dataset and the training features and to strengthen the model's learning ability. Experimental results on the Cityscapes urban-scene dataset show that, compared with the baseline model YOLOv8n-seg, DE-YOLO improves the mask mean Average Precision (mAPmask) by 2.0 percentage points and the average precision at an IoU threshold of 0.5 (mAPmask@0.5) by 3.2 percentage points. The algorithm improves accuracy while maintaining a high detection speed and a small parameter count, with 2.2 to 31.3 percentage points fewer parameters than comparable models.
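The abstract does not spell out the WIoU loss itself. As a rough sketch only, the NumPy function below follows the formulation commonly given for WIoU v3: the plain IoU loss is scaled by a center-distance term and by a non-monotonic focusing coefficient. The function name `wiou_v3`, the `mean_liou` argument (a running mean of the IoU loss that the caller is assumed to maintain), and the defaults `alpha=1.9` and `delta=3.0` are assumptions for illustration, not settings taken from this paper.

```python
import numpy as np

def wiou_v3(pred, gt, mean_liou, alpha=1.9, delta=3.0, eps=1e-7):
    """Sketch of a WIoU-v3-style loss for (N, 4) boxes given as x1, y1, x2, y2.

    Returns (per-box loss, per-box plain IoU loss). In a real training loop
    the enclosing-box term and beta would be detached from the gradient graph;
    NumPy has no gradients, so that detail is only noted here.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Plain IoU loss: 1 - intersection / union
    iw = np.clip(np.minimum(pred[:, 2], gt[:, 2]) - np.maximum(pred[:, 0], gt[:, 0]), 0, None)
    ih = np.clip(np.minimum(pred[:, 3], gt[:, 3]) - np.maximum(pred[:, 1], gt[:, 1]), 0, None)
    inter = iw * ih
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    l_iou = 1.0 - inter / (area_p + area_g - inter + eps)

    # Distance term: squared center distance over the squared diagonal
    # of the smallest box enclosing both boxes
    cw = np.maximum(pred[:, 2], gt[:, 2]) - np.minimum(pred[:, 0], gt[:, 0])
    ch = np.maximum(pred[:, 3], gt[:, 3]) - np.minimum(pred[:, 1], gt[:, 1])
    dx = ((pred[:, 0] + pred[:, 2]) - (gt[:, 0] + gt[:, 2])) / 2.0
    dy = ((pred[:, 1] + pred[:, 3]) - (gt[:, 1] + gt[:, 3])) / 2.0
    r_wiou = np.exp((dx ** 2 + dy ** 2) / (cw ** 2 + ch ** 2 + eps))

    # Non-monotonic focusing: beta is the "outlier degree" of each box;
    # the coefficient equals 1 when beta == delta and shrinks for very
    # easy (beta near 0) and very poor (large beta) boxes
    beta = l_iou / (mean_liou + eps)
    focus = beta / (delta * alpha ** (beta - delta))

    return focus * r_wiou * l_iou, l_iou
```

Because the focusing coefficient is small both for near-perfect boxes and for extreme outliers, low-quality examples contribute less gradient, which is the "harmful gradient" reduction the abstract refers to.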

Key words: YOLOv8 network, instance segmentation, efficient multi-scale attention, deformable convolution, loss function
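For readers unfamiliar with Mixup, the augmentation mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration, assuming both images already share the same shape; the function name `mixup`, the list-based instance representation, and the Beta parameter `alpha=32.0` are hypothetical choices, not the paper's settings.

```python
import numpy as np

def mixup(img_a, inst_a, img_b, inst_b, alpha=32.0, rng=None):
    """Blend two same-sized uint8 images and keep the instances of both.

    inst_a and inst_b are lists of per-instance annotations (for example
    masks or boxes); for detection and segmentation the blended image is
    assumed to carry the union of both annotation lists.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("images must share the same shape")
    if rng is None:
        rng = np.random.default_rng()

    # Mixing ratio drawn from a symmetric Beta distribution, in (0, 1)
    lam = float(rng.beta(alpha, alpha))

    # Pixel-wise convex combination of the two images
    mixed = lam * img_a.astype(np.float32) + (1.0 - lam) * img_b.astype(np.float32)
    mixed = np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

    return mixed, list(inst_a) + list(inst_b), lam
```

A large symmetric `alpha` concentrates the ratio near 0.5, so both source images stay visible in the blended sample; smaller values would make one image dominate more often.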