Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 297-310. doi: 10.19678/j.issn.1000-3428.0069311

• Development Research and Engineering Application •

Mask-YOLO: Improved Mask Detection Algorithm Based on YOLOv5n

LI Yi1, XU Huiying1,*, ZHU Xinzhong1, HUANG Xiao2, WANG Shumeng1, LI Xiyu1

  1. School of Computer Science and Technology, Zhejiang Normal University, Jinhua 321004, Zhejiang, China
  2. College of Education, Zhejiang Normal University, Jinhua 321004, Zhejiang, China
  • Received: 2024-01-29 Online: 2025-06-15 Published: 2024-06-20
  • Contact: XU Huiying

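  • Funding:
    National Natural Science Foundation of China (62376252); National Natural Science Foundation of China (61976196); Key Project of the Zhejiang Provincial Natural Science Foundation (LZ22F030003); Key Project of the National College Students' Innovation Training Program (202310345042)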

Abstract:

As basic personal protective equipment, masks play an important role in public health. Existing mask detection algorithms suffer from low precision in complex scenes. To improve detection precision and training stability, this study proposes Mask-YOLO, an improved lightweight mask detection algorithm based on YOLOv5n. First, the Softplus activation function is used in the convolutional blocks of the feature-extraction backbone, which improves the efficiency of the model's nonlinear mapping and speeds up convergence during training. Second, Coordinate Attention is added to the deep layers of the backbone; by embedding positional information into channel attention, it helps the model capture larger object regions and channel-wise target features without high memory overhead. Third, the Spatial Pyramid Pooling Fast (SPPF) module in the deep network is replaced with a Receptive Field Block (RFB) module, which uses several dilation rates to enlarge the receptive field of the convolutional feature sampling and to obtain rich semantic information about objects. Fourth, on top of the original PANet multi-scale feature fusion structure, a weighted BiFPN-style cross-stage fusion design is added so that object features carrying spatial and semantic information at different scales are fully fused and exchanged, further improving the precision of small object detection. Fifth, Distance Intersection over Union (DIoU) is used as the bounding-box regression loss to address unstable bounding-box regression and missed detections. Finally, Soft-NMS is employed to further improve detection performance by lowering the confidence scores of overlapping predicted bounding boxes instead of discarding them outright. Experimental results show that Mask-YOLO improves mAP@0.95 by 8.58% over the baseline YOLOv5n, addressing the low precision on small objects, the unstable bounding-box regression, and the slow training convergence of the original YOLOv5n in mask detection, and thereby achieving efficient mask detection.
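The abstract names DIoU and Soft-NMS without giving formulas, so the following is only a minimal sketch of the two ideas in plain Python, not the authors' implementation. It assumes the commonly used DIoU definition (IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box) and the Gaussian decay variant of Soft-NMS; the function names, the `(x1, y1, x2, y2)` box format, and the `sigma` and `score_thresh` values are illustrative assumptions.

```python
import math


def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw ** 2 + ch ** 2
    diou = iou - d2 / c2 if c2 > 0 else iou
    return iou, diou


def diou_loss(pred, target):
    """DIoU regression loss: 0 for identical boxes, above 1 for distant ones."""
    return 1.0 - iou_and_diou(pred, target)[1]


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant).

    Returns a list of (box index, decayed score) in selection order.
    Overlapping boxes have their scores decayed rather than being removed.
    """
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        for i in remaining:
            iou, _ = iou_and_diou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(iou ** 2) / sigma)
        # Drop boxes whose decayed score has become negligible.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return kept


if __name__ == "__main__":
    print(diou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
    print(soft_nms([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
                   [0.9, 0.8, 0.7]))
```

Two properties are worth noting in this sketch. The DIoU loss stays informative for non-overlapping boxes, because the centre-distance term is nonzero even when IoU is zero. Soft-NMS keeps a heavily overlapping box with a reduced score, where hard NMS would discard it, which is one way closely spaced faces can avoid being suppressed.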

Key words: object detection, mask detection, feature fusion, YOLOv5n, Feature Pyramid Network (FPN)
