Research on Traffic Sign Detection Based on CGS-Ghost YOLO

doi:10.19678/j.issn.1000-3428.0066520

Abstract:

In traffic sign detection, the YOLOv5 algorithm suffers from missed detections, false detections, and an excessive number of model parameters under complex environments and road conditions. To address these problems, an improved detection model, CGS-Ghost YOLO, is proposed. YOLOv5 downsamples the input image with the Focus module, which adds a large number of parameters; CGS-Ghost YOLO replaces the Focus module with the StemBlock module, which reduces the parameter count while maintaining accuracy. A Coordinate Attention (CA) mechanism is introduced to strengthen the semantic and positional information in the features and to improve the model's feature extraction ability. A CGS convolution module that combines the SMU activation function with Group Normalization (GN) is designed to avoid the influence of batch size on the model during training, and GhostConv is used to reduce the number of model parameters while improving detection accuracy. On this basis, the $ \alpha $-CIoU Loss+VFocal Loss loss function is used to mitigate the imbalance between positive and negative samples in traffic sign detection and to improve overall model performance. The neck uses a Bi-FPN bidirectional feature pyramid network to fuse the multi-scale features of detection targets effectively. Experimental results on the TT100K traffic sign detection dataset show that the mean average precision (mAP) of the improved CGS-Ghost YOLO model reaches 93.1%, which is 11.3 percentage points higher than that of the original model, and that the number of model parameters is reduced by 21.2% compared with the original model. In summary, the proposed network model optimizes the convolution layers and the downsampling part, considerably reducing the model parameters while improving detection accuracy.
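The abstract describes the CGS convolution module only as a combination of the SMU activation function and group normalization; the exact layer layout is not given here. As a rough, dependency-free sketch of those two ingredients, the Python below implements SMU in its commonly cited form, $\mathrm{SMU}(x) = \tfrac{1}{2}\left[(1+\alpha)x + (1-\alpha)x\,\mathrm{erf}(\mu(1-\alpha)x)\right]$, and a per-sample group normalization. The function names, the defaults `alpha=0.25` and `mu=1.0`, and the omission of the convolution itself and of learnable scale/shift parameters are assumptions made for illustration, not the authors' implementation.

```python
import math


def smu(x, alpha=0.25, mu=1.0):
    """Smooth approximation of max(x, alpha * x), a leaky-ReLU-like shape.

    alpha and mu are illustrative defaults (assumptions); in a trained
    network mu is typically a learnable parameter.
    """
    return 0.5 * ((1 + alpha) * x + (1 - alpha) * x * math.erf(mu * (1 - alpha) * x))


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample, given as a list of channels (each a list of floats).

    Mean and variance are computed per group of channels inside this single
    sample, so the result does not depend on the batch size.
    Learnable scale and shift are omitted for brevity.
    """
    num_channels = len(sample)
    if num_channels % num_groups != 0:
        raise ValueError("num_channels must be divisible by num_groups")
    per_group = num_channels // num_groups
    out = []
    for g in range(num_groups):
        channels = sample[g * per_group:(g + 1) * per_group]
        values = [v for ch in channels for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in ch] for ch in channels])
    return out


def gn_then_smu(sample, num_groups):
    """Apply group normalization followed by SMU (the convolution is omitted)."""
    return [[smu(v) for v in ch] for ch in group_norm(sample, num_groups)]
```

Because the statistics are taken over channel groups within one sample, calling `group_norm` on a single image gives the same result as it would inside a large batch, which is the property the abstract attributes to the CGS module.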

Key words: deep learning, object detection, YOLOv5 detection algorithm, attention mechanism, CGS Conv module
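For readers unfamiliar with the $ \alpha $-CIoU term named in the abstract, the sketch below follows the published Alpha-IoU formulation, $L = 1 - \mathrm{IoU}^{\alpha} + (\rho^2/c^2)^{\alpha} + (\beta v)^{\alpha}$, for a single pair of boxes. The abstract does not state the value of $\alpha$ or how this term is weighted against VFocal Loss, so the default `alpha=3.0`, the `(x1, y1, x2, y2)` box format, and the function name are assumptions; the VFocal Loss part is not shown.

```python
import math


def alpha_ciou_loss(box_pred, box_gt, alpha=3.0, eps=1e-9):
    """Alpha-CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    alpha=3.0 is an assumed default; alpha=1.0 reduces to the plain CIoU loss.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared centre distance (rho^2) and squared diagonal of the
    # smallest enclosing box (c^2)
    rho2 = (((px1 + px2) - (gx1 + gx2)) ** 2 + ((py1 + py2) - (gy1 + gy2)) ** 2) / 4
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight beta
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    beta = v / ((1 - iou) + v + eps)

    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

With $\alpha > 1$ the loss and its gradient are weighted more heavily toward boxes that already have a high IoU, which is the motivation usually given for the power form.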

Hong ZHAO, Yubo FENG. Research on Traffic Sign Detection Based on CGS-Ghost YOLO[J]. Computer Engineering, 2023, 49(12): 194-204.

http://www.ecice06.com/CN/Y2023/V49/I12/194

Figures/Tables 19

Fig.1 Development status of object detection based on convolutional neural networks

Fig.2 Network structure of CGS-Ghost YOLO

Fig.3 Group normalization principle

Fig.4 Batch normalization and group normalization

Fig.5 Error rates of batch normalization and group normalization

Fig.6 StemBlock composition

Fig.7 Principle diagram of conventional convolution and GhostConv

Fig.8 CA mechanism

Fig.9 Bi-FPN principle diagram

Fig.10 Sample images from the dataset

Fig.11 PR curves for all classes

Fig.12 Confusion matrix

Fig.13 Examples of detection results

