
Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 265-274. doi: 10.19678/j.issn.1000-3428.0065701

• Development Research and Engineering Application •

Mask Wearing Detection Algorithm Based on Improved YOLOv5

Xinyi ZHANG, Fei ZHANG, Bin HAO, Lu GAO, Xiaoying REN

  1. School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou 014000, Inner Mongolia, China
  • Received:2022-09-07 Online:2023-08-15 Published:2022-12-07
  • About the authors:

    ZHANG Xinyi (born 1998), female, M.S. candidate; her main research interest is image processing

    ZHANG Fei, associate professor, Ph.D.

    HAO Bin, lecturer, Ph.D.

    GAO Lu, associate professor, M.S.

    REN Xiaoying, lecturer

  • Funding:
    Science and Technology Program of Inner Mongolia Autonomous Region (2021GG0046); Science and Technology Program of Inner Mongolia Autonomous Region (2021GG0048)

Abstract:

In dense crowd scenes in public places, face mask wearing detection algorithms perform poorly because of the information lost to target occlusion and because the targets are small and of low resolution. To improve the detection accuracy and speed of the model and to reduce its hardware footprint, an improved mask wearing detection algorithm based on YOLOv5s is proposed. Conventional convolution is replaced with Ghost Shuffle Convolution (GSConv), which combines Standard Convolution (SConv) and Depth-Wise separable Convolution (DWConv) with channel shuffling, improving network speed while maintaining accuracy. Nearest-neighbor upsampling is replaced with a lightweight universal upsampling operator to make full use of semantic feature information, and Adaptive Spatial Feature Fusion (ASFF) is added at the end of the Neck of the improved YOLOv5s model, allowing features at different scales to be fused more effectively and improving detection accuracy. In addition, adaptive image sampling is used to alleviate data imbalance, and Mosaic data augmentation is used to make full use of small targets. Experimental results show that the algorithm achieves a mean Average Precision (mAP) of 93% on the AIZOO dataset, a 2 percentage point improvement over the original YOLOv5 model, and a detection precision of 97.7% for faces wearing masks, outperforming the YOLO series, SSD, and RetinaFace under the same conditions. Inference speed on a GPU is improved by 16.7 percentage points, and the model weight file occupies only 23.5 MB, making the algorithm suitable for real-time mask wearing detection.
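
The GSConv operation described above can be illustrated with a short PyTorch sketch: a standard convolution produces half of the output channels, a depth-wise convolution applied to that result produces the other half, and the concatenated channels are shuffled. The module name, layer sizes, kernel choices, and activation below are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn


class GSConvSketch(nn.Module):
    """Minimal sketch of a GSConv-style block: standard conv + depth-wise conv + channel shuffle."""

    def __init__(self, c_in: int, c_out: int, k: int = 1, s: int = 1):
        super().__init__()
        c_half = c_out // 2
        # Standard convolution produces half of the output channels.
        self.std_conv = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half),
            nn.SiLU(),
        )
        # Depth-wise convolution (groups == channels) produces the other half cheaply.
        self.dw_conv = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.std_conv(x)
        b = self.dw_conv(a)
        y = torch.cat((a, b), dim=1)
        # Channel shuffle: interleave the two halves so the dense and
        # depth-wise feature maps exchange information across channels.
        n, c, h, w = y.shape
        return y.view(n, 2, c // 2, h, w).transpose(1, 2).reshape(n, c, h, w)


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)
    print(GSConvSketch(64, 128)(x).shape)  # expected: torch.Size([1, 128, 80, 80])

The shuffle step is what separates this from a plain Ghost-style block: without it, the depth-wise half would remain isolated from the standard-convolution half in subsequent 1x1 convolutions.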

Key words: mask wearing detection, YOLOv5s model, Ghost Shuffle Convolution (GSConv), Adaptive Spatial Feature Fusion (ASFF), lightweight universal upsampling operator
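
As a companion to the abstract, the following is a minimal PyTorch sketch of the Adaptive Spatial Feature Fusion (ASFF) idea over three Neck outputs: each scale is resized to a common resolution, a 1x1 convolution predicts a per-pixel logit for each scale, and a softmax over scales yields fusion weights that sum to one at every spatial position. The channel count, resizing mode, and use of a single shared output map are assumptions for illustration, not the paper's exact design.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ASFFSketch(nn.Module):
    """Minimal sketch of adaptive spatial feature fusion over three scales."""

    def __init__(self, channels: int):
        super().__init__()
        # One 1x1 conv per scale predicts a per-pixel importance logit.
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(3)]
        )

    def forward(self, feats):
        # feats: three maps with the same channel count but different spatial sizes.
        target_size = feats[0].shape[-2:]
        resized = [
            f if f.shape[-2:] == target_size
            else F.interpolate(f, size=target_size, mode="nearest")
            for f in feats
        ]
        # Softmax across the scale dimension -> per-pixel fusion weights.
        logits = torch.cat(
            [conv(f) for conv, f in zip(self.weight_convs, resized)], dim=1
        )
        weights = torch.softmax(logits, dim=1)
        return sum(weights[:, i:i + 1] * resized[i] for i in range(3))


if __name__ == "__main__":
    p3 = torch.randn(1, 128, 80, 80)
    p4 = torch.randn(1, 128, 40, 40)
    p5 = torch.randn(1, 128, 20, 20)
    print(ASFFSketch(128)([p3, p4, p5]).shape)  # expected: torch.Size([1, 128, 80, 80])

Because the weights are predicted per pixel, a location dominated by a small face can draw mostly on the high-resolution scale while larger faces rely on the coarser scales, which is the kind of scale-aware fusion the abstract attributes to adding ASFF at the end of the Neck.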