Road Traffic Small Target Vehicle Detection Algorithm Based on Improved YOLOv8

doi:10.19678/j.issn.1000-3428.0069825

Abstract:

To address the difficulty of recognizing small target vehicles on traffic roads, together with low detection accuracy, false detections, and missed detections, this study proposes RGGE-YOLOv8, a road traffic small target vehicle detection model based on the YOLOv8 algorithm that combines large kernels with a multi-scale gradient combination. First, the RepLayer model replaces the backbone of the YOLOv8 network and introduces a large-kernel depthwise separable convolution structure that widens the available context, strengthening the model's ability to capture information about small targets. Second, the Generalized IoU (GIoU) loss replaces the original loss function, resolving the problem that IoU cannot be optimized when the predicted box and the ground-truth box do not overlap. Third, a Global Attention Mechanism (GAM) is introduced to improve the feature representation capability of the network by reducing information loss and enhancing global interaction. Finally, CSPNet is incorporated and the gradient combination feature pyramid is re-parameterized, giving the model a large receptive field and a high shape bias. Experimental results show that RGGE-YOLOv8 reaches an mAP@0.5 of 34.8% on the Visdrone dataset and 94.7% on the authors' own dataset, which is 2.2 and 5.51 percentage points higher, respectively, than the original YOLOv8n algorithm. These results demonstrate the effectiveness of the RGGE-YOLOv8 model for detecting small target vehicles on traffic roads.
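The GIoU step in the abstract can be illustrated with a small, framework-free sketch. It follows the standard Generalized IoU definition (IoU minus the fraction of the smallest enclosing box not covered by the union) and assumes axis-aligned boxes in `(x1, y1, x2, y2)` form with positive area. The function names `giou` and `giou_loss` are illustrative, and how the paper wires the loss into YOLOv8 training is not stated here, so this is not the authors' implementation.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area (x2 > x1 and y2 > y1).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalise the empty part of the enclosing box; this term keeps
    # changing with distance even when the boxes do not overlap.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: 0 for identical boxes, larger for worse matches."""
    return 1.0 - giou(box_a, box_b)


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0, 1.0)
    near = (2.0, 0.0, 3.0, 1.0)   # disjoint, one unit away
    far = (4.0, 0.0, 5.0, 1.0)    # disjoint, three units away
    print(giou_loss(target, near), giou_loss(target, far))
```

For the two disjoint predictions in the usage example, plain IoU is 0 in both cases, whereas the GIoU loss is larger for the more distant box, so it still distinguishes a near miss from a far one.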

Key words: YOLOv8, small target detection, deep learning, multi-scale feature pyramid, attention mechanism
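As a rough illustration of why the abstract pairs large kernels with depthwise separable convolution, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The channel count (64) and kernel sizes (3 and 13) are assumptions chosen for illustration only; the abstract does not state the sizes used in RGGE-YOLOv8. Bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 64              # illustrative value, not from the paper
    for k in (3, 13):          # illustrative kernel sizes, not from the paper
        std = standard_conv_params(channels, channels, k)
        dws = depthwise_separable_params(channels, channels, k)
        print(f"k={k}: standard={std}, depthwise separable={dws}")
```

Under these assumptions, growing the kernel from 3 to 13 multiplies the standard convolution's weights by roughly 19, while the depthwise separable version grows by only about 3 times, which is what makes a wide context affordable.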

HUO Jiuyuan, SU Hongrui, WU Zeyu, WANG Tingjuan. Road Traffic Small Target Vehicle Detection Algorithm Based on Improved YOLOv8[J]. Computer Engineering, 2025, 51(1): 246-257.

https://www.ecice06.com/CN/Y2025/V51/I1/246

Figures/Tables 15

Fig.1 Overall framework of RGGE-YOLOv8

Fig.2 Structure of the improved Backbone

Fig.3 Structure of GAM

Fig.4 Structures of CSPNet and the improved ResCSP

Fig.5 Structure of the ReConext network module

Fig.6 Dataset samples before and after preprocessing

Fig.7 Comparison between the improved and unimproved models

Fig.8 Confusion matrices of the two models

Fig.9 Stacked plot comparing the accuracy of the two models

Fig.10 Experimental results on the traffic vehicle dataset

Fig.11 Comparison of the improved and unimproved models on the Visdrone dataset

