Safety Helmet Wearing Detection Based on Improved YOLOX-m

doi:10.19678/j.issn.1000-3428.0067820

Abstract

Safety helmet wearing detection is an important component of safety monitoring systems, and its accuracy depends on factors such as object classification, small-object detection, and domain transfer discrepancy. Existing safety helmet wearing detection algorithms based on YOLOX-m often suffer from low classification accuracy, incomplete detection of targets, and degraded performance of lightweight models. To address these problems, an improved YOLOX-m model based on a multi-stage network training strategy is proposed. First, the number of stacked convolution blocks in the YOLOX-m backbone feature network is redesigned to reduce the network size while maximizing model performance. Next, the Residual Re-parameterized Visual Geometry Group (Res-RepVGG) is combined with Spatial Pyramid Pooling-Fast (SPPF) to improve detection accuracy and inference speed. In addition, a multi-stage network training strategy is designed: the training and test sets are split into multiple groups, and the pseudo labels generated in the inference stage are used in repeated rounds of network training, which reduces the domain transfer discrepancy and yields higher detection accuracy. Experimental results show that, compared with YOLOX-m, the improved YOLOX-m reduces the inference latency by 5 ms, reduces the model size by 4.7 MB, and improves the average detection accuracy by 1.26 percentage points.

Key words: safety helmet wearing detection, deep learning, Residual Re-parameterized Visual Geometry Group (Res-RepVGG), Spatial Pyramid Pooling-Fast (SPPF), multi-stage network training strategy
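The multi-stage training strategy summarized in the abstract can be sketched roughly as follows. This is a minimal illustration only: the abstract does not specify how groups and pseudo labels are combined at each stage, so the cumulative scheme, the confidence threshold `min_conf`, and the names `split_into_groups`, `multi_stage_training`, `train_fn`, and `predict_fn` are all assumptions introduced here, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_groups(items: Sequence, num_groups: int) -> List[list]:
    """Split items into num_groups contiguous, near-equal groups."""
    if num_groups < 1:
        raise ValueError("num_groups must be >= 1")
    size, extra = divmod(len(items), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        end = start + size + (1 if g < extra else 0)
        groups.append(list(items[start:end]))
        start = end
    return groups


def multi_stage_training(
    train_set: Sequence[Tuple[object, object]],  # (image, ground-truth label)
    test_images: Sequence[object],               # unlabeled target-domain images
    num_stages: int,
    train_fn: Callable,    # hypothetical: (model_or_None, samples) -> model
    predict_fn: Callable,  # hypothetical: (model, image) -> (label, confidence)
    min_conf: float = 0.5,  # assumed pseudo-label confidence threshold
):
    """Assumed scheme: each stage trains on all labeled groups seen so far
    plus the pseudo labels collected from earlier stages, then labels the
    next test group with the freshly trained model."""
    train_groups = split_into_groups(train_set, num_stages)
    test_groups = split_into_groups(test_images, num_stages)
    labeled, pseudo, model = [], [], None
    for stage in range(num_stages):
        labeled.extend(train_groups[stage])
        model = train_fn(model, labeled + pseudo)
        for image in test_groups[stage]:
            label, conf = predict_fn(model, image)
            if conf >= min_conf:  # keep only confident predictions
                pseudo.append((image, label))
    return model, pseudo
```

In this sketch the pseudo labels produced in the final stage are returned but not trained on; a real pipeline might add one more training round or use a different grouping, such as the non-uniform split mentioned in the caption of Fig.11.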

Xiaolong WANG, Bo JIANG. Safety Helmet Wearing Detection Based on Improved YOLOX-m[J]. Computer Engineering, 2023, 49(12): 252-261.

http://www.ecice06.com/CN/Y2023/V49/I12/252

Figures/Tables 19

Fig.1 Structure of the YOLOX-m network

Fig.2 Simplified structure of the YOLOX-m network

Fig.3 Structure of the Res-RepVGG network

Fig.4 Structure of the SPPF

Fig.5 Improvement of the backbone feature network by the Res-RepVGG and SPPF

Fig.6 Comparison of detection results before and after the YOLOX-m improvement

Fig.7 Multi-stage network training strategy

Fig.8 Category data distribution of the safety helmet dataset

Fig.9 Training process of the improved model

Fig.10 Comparison of detection results between W-YOLOX and three other models

Fig.11 Experimental results of multi-stage network training with non-uniform split (S=3)
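The SPPF block named in the abstract and in Fig.4 is commonly understood (for example in YOLOv5) as three chained 5x5 stride-1 max pools that reproduce the 5x5, 9x9, and 13x13 parallel pools of the original spatial pyramid pooling at lower cost. The sketch below checks that equivalence in plain Python on a single feature map. It assumes this standard formulation, which may differ from the exact block in the paper; the function names are hypothetical, and the channel concatenation and 1x1 convolutions around the pooling are omitted.

```python
from typing import List, Sequence

Grid = List[List[float]]


def max_pool_same(x: Grid, k: int) -> Grid:
    """Stride-1 max pooling with a k x k window (k odd).

    Windows are clipped at the borders, which gives the same result as
    padding the input with negative infinity."""
    r = k // 2
    h, w = len(x), len(x[0])
    return [
        [
            max(
                x[ii][jj]
                for ii in range(max(0, i - r), min(h, i + r + 1))
                for jj in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def spp_parallel(x: Grid, kernels: Sequence[int] = (5, 9, 13)) -> List[Grid]:
    """SPP style: pool the same input with several kernel sizes in parallel."""
    return [x] + [max_pool_same(x, k) for k in kernels]


def sppf_serial(x: Grid, k: int = 5) -> List[Grid]:
    """SPPF style: reuse each pooled map as the input of the next pool."""
    y1 = max_pool_same(x, k)
    y2 = max_pool_same(y1, k)
    y3 = max_pool_same(y2, k)
    return [x, y1, y2, y3]


# Small deterministic feature map used to compare the two variants.
feature = [[float((i * 7 + j * 13) % 17) for j in range(16)] for i in range(16)]
same_outputs = spp_parallel(feature) == sppf_serial(feature)
```

The serial form gives the same maps because a max over a 5x5 window applied twice covers a 9x9 neighborhood, and applied three times a 13x13 neighborhood, while each pass only scans 25 positions per output element.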

