Lightweight Traffic Sign Detection Network Based on Weak Semantic Segmentation

doi:10.19678/j.issn.1000-3428.0062671

Abstract: Existing networks are slow and insufficiently accurate when detecting traffic signs in high-resolution images. To address this, a lightweight traffic sign detection network is proposed. The backbone of the YOLOv4 network is optimized on the basis of MobileNetv3-Large: some time-consuming layers are discarded according to the characteristics of the dataset, the numbers of output channels of layers 8 and 14 are changed, and the attention mechanism of the Squeeze-and-Excitation Network (SENet), the channel attention network in the basic module, is improved so that the output weights represent the importance of features more accurately. A dynamic enhancement attachment based on weak semantic segmentation is added in front of the detection head, and its output is used as a spatial weight distribution to correct the activated regions, which avoids the false and missed detections caused by reduced feature extraction ability. The resulting network is named YOLOv4-SLite. High-resolution images are trained and predicted with a sliding-window cropping method, which reduces training time and increases sample diversity. Experimental results on the TT100K traffic sign dataset show that, compared with the YOLOv4 baseline network, the mAP@0.5 of YOLOv4-SLite decreases by only 0.2%, while the model size is reduced by 96.5% and the response speed is increased by 227%, achieving the expected balance between accuracy and speed.
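The relative figures in the abstract can be converted into multiplicative factors with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "response speed increased by 227%" refers to throughput (for example, frames per second) rather than latency, which the abstract does not specify.

```python
def remaining_fraction(reduction_percent: float) -> float:
    """Fraction of the original quantity left after a relative reduction."""
    return 1.0 - reduction_percent / 100.0


def growth_factor(increase_percent: float) -> float:
    """Multiplicative factor corresponding to a relative increase."""
    return 1.0 + increase_percent / 100.0


# Figures quoted in the abstract (relative to the YOLOv4 baseline).
size_fraction = remaining_fraction(96.5)  # share of the baseline model size that remains
shrink_factor = 1.0 / size_fraction       # how many times smaller the model is
speed_factor = growth_factor(227.0)       # assumed to be a throughput ratio

print(f"model size: {size_fraction:.3f} of baseline (about {shrink_factor:.1f}x smaller)")
print(f"speed: {speed_factor:.2f}x baseline")
```

Under these assumptions, the model keeps about 3.5% of the baseline size (roughly 28.6 times smaller) and runs at about 3.27 times the baseline speed.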

Key words: traffic sign detection, YOLOv4 network, lightweight network, weak semantic segmentation, attention mechanism
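The sliding-window cropping mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 608-pixel window, and the 480-pixel stride are assumptions chosen for the example, and the 2048 x 2048 image size is assumed for TT100K rather than taken from the abstract.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


def axis_starts(length: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the windows cover [0, length)."""
    if window >= length:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    # Add a final window flush with the border if the stride left a gap.
    if starts[-1] + window < length:
        starts.append(length - window)
    return starts


def sliding_windows(width: int, height: int, window: int, stride: int) -> List[Box]:
    """Enumerate square crop boxes in row-major order over a width x height image."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    boxes = []
    for y in axis_starts(height, window, stride):
        for x in axis_starts(width, window, stride):
            boxes.append((x, y, min(x + window, width), min(y + window, height)))
    return boxes


def shift_to_image(det: Box, crop: Box) -> Box:
    """Map a box predicted in crop coordinates back to full-image coordinates."""
    dx, dy = crop[0], crop[1]
    return (det[0] + dx, det[1] + dy, det[2] + dx, det[3] + dy)


# Hypothetical settings: 608-pixel windows with a 480-pixel stride (128-pixel overlap).
crops = sliding_windows(2048, 2048, window=608, stride=480)
print(len(crops))  # 16 crops for a 2048 x 2048 image
```

With an overlap of window minus stride pixels, a sign cut by the border of one crop can still appear whole in a neighbouring crop, provided the sign is smaller than the overlap. Detections from overlapping crops would then need to be merged in full-image coordinates, for example with non-maximum suppression; the abstract does not state how the authors handle this step.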

CLC number: TP391

ZENG Leiming, HOU Jin, CHEN Zirui, ZHOU Haoran. Lightweight Traffic Sign Detection Network Based on Weak Semantic Segmentation[J]. Computer Engineering, 2022, 48(9): 269-276,285.

https://www.ecice06.com/CN/Y2022/V48/I9/269
