Traffic Sign Detection Network Based on New Operator Sampling Optimization

doi:10.19678/j.issn.1000-3428.0063188

Abstract

Traditional traffic sign detection networks based on Convolutional Neural Networks (CNNs) downsample by stacking large numbers of convolution kernels. This limits the receptive field modeling of the CNN and makes it difficult to adjust internal parameters flexibly, so image details are lost and both the detection accuracy and the localization accuracy for small and occluded targets decrease. This study proposes a traffic sign detection network based on YOLOv5 with optimized sampling. With a new operator as the basic architecture, features of different channels are extracted flexibly via self-convolution, and a cross-stage attention mechanism module is constructed to increase the importance weights of the features of each channel, which improves the detection of small targets. An improved Path Aggregation Network (PAN) fuses and enhances multi-scale semantic information and detailed features. In addition, the K-means clustering algorithm is used to generate prior boxes better suited to traffic signs, and the Distance Intersection over Union (DIoU) function is introduced into the Non-Maximum Suppression (NMS) algorithm to post-process the predicted boxes, which avoids erroneously suppressing occluded targets in complex scenes and thereby improves localization accuracy. Experimental results on the Chinese traffic sign dataset show that, at an Intersection over Union (IoU) threshold of 0.5, the mean Average Precision (mAP) of the proposed network is 95.8%. Compared with the YOLOv5 network, the number of model parameters is reduced by 15.7%, and the network achieves good small-target detection performance while meeting real-time requirements.

Key words: traffic sign detection, feature fusion, self-attention operator, small target, attention mechanism
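
The abstract states that a DIoU function is used inside NMS so that occluded targets are not wrongly suppressed. The following is a minimal sketch of that general idea, not the authors' implementation. It uses the standard DIoU definition (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box). The function names `diou` and `diou_nms`, the `(x1, y1, x2, y2)` box format, and the suppression threshold of 0.4 are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2), with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared distance between box centers."""
    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.4) -> List[int]:
    """Greedy NMS that suppresses a box only if its DIoU with a kept box exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


# Illustrative data only: box 1 nearly duplicates box 0; box 2 is a shifted neighbor.
boxes = [(0, 0, 10, 10), (0.5, 0, 10.5, 10), (4, 0, 14, 10)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores, threshold=0.4)
print(kept)
```

In this toy input, the plain IoU between box 0 and box 2 is about 0.43, so IoU-based NMS with the same 0.4 threshold would discard box 2. The center-distance penalty lowers the score to about 0.37, so the DIoU variant keeps it, while the near-duplicate box 1 is still suppressed.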

CLC number: TP391

CHEN Chunhui (陈春辉), MA Shexiang (马社祥). Traffic Sign Detection Network Based on New Operator Sampling Optimization[J]. Computer Engineering, 2022, 48(10): 306-312. (in Chinese)

http://www.ecice06.com/CN/Y2022/V48/I10/306

Figures/Tables: 13

