Unstructured Road Defect Detection Algorithm Based on Semi-Supervised Learning

doi:10.19678/j.issn.1000-3428.0069534

Abstract:

Detecting defects on unstructured roads is important for road traffic safety; however, the annotated datasets required for detection are limited. This study proposes the Multi-Augmentation with Memory (MAM) semi-supervised object detection algorithm to address the lack of annotated datasets for unstructured roads and the limited ability of existing models to learn from unlabeled data. First, a cache mechanism is introduced to store the bounding-box regression positions of unlabeled images and pseudo-labeled images, which avoids the computational cost of subsequent matching. Second, a hybrid data augmentation strategy is designed that mixes the cached pseudo-labeled images with unlabeled images and feeds the mixture into the student model, to improve the model's generalization to new data and to balance the scale distribution of images. The MAM algorithm is not tied to a particular object detection model and better maintains the consistency of object bounding boxes, so no consistency loss needs to be computed. Experimental results show that the MAM algorithm outperforms the other fully supervised and semi-supervised learning algorithms compared. On a self-built unstructured road defect dataset, called Defect, the MAM algorithm improves mean Average Precision (mAP) over the Soft Teacher algorithm by 6.8, 11.1, and 6.0 percentage points at annotation ratios of 10%, 20%, and 30%, respectively. On a self-built unstructured road pothole dataset, called Pothole, the MAM algorithm improves mAP over the Soft Teacher algorithm by 5.8 and 4.3 percentage points at annotation ratios of 15% and 30%, respectively.

Key words: unstructured road, defect object detection, semi-supervised learning, pseudo label, cache mechanism, hybrid data augmentation
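
The two components named in the abstract, a cache of pseudo-labeled images and a mixing step that combines a cached image with an unlabeled one, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `PseudoLabelCache` and `mix_side_by_side`, the FIFO eviction policy, the `(x1, y1, x2, y2)` pixel box format, and the specific mixing rule (halve both images and tile them left and right). The actual MAM cache contents and augmentation operations may differ.

```python
from collections import OrderedDict

import numpy as np


class PseudoLabelCache:
    """Hypothetical bounded FIFO cache: image id -> (image, pseudo boxes)."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self._store = OrderedDict()

    def put(self, image_id, image, boxes):
        # Re-inserting an id moves it to the newest position.
        self._store.pop(image_id, None)
        boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
        self._store[image_id] = (image, boxes)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the oldest entry

    def sample(self, rng):
        # Draw one cached (image, boxes) pair uniformly at random.
        keys = list(self._store)
        return self._store[keys[rng.integers(len(keys))]]

    def __len__(self):
        return len(self._store)


def mix_side_by_side(img_a, boxes_a, img_b, boxes_b):
    """Subsample both images by 2 and tile them left/right (assumed rule).

    Both images must share one shape with even height and width.
    Boxes are (x1, y1, x2, y2) in pixels and are remapped to the mixed image.
    """
    assert img_a.shape == img_b.shape, "images must have the same shape"
    h, w = img_a.shape[:2]
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"

    small_a = img_a[::2, ::2]
    small_b = img_b[::2, ::2]
    mixed = np.concatenate([small_a, small_b], axis=1)  # shape (h/2, w, ...)

    boxes_a = np.asarray(boxes_a, dtype=np.float32).reshape(-1, 4) * 0.5
    boxes_b = np.asarray(boxes_b, dtype=np.float32).reshape(-1, 4) * 0.5
    # The right-hand tile starts at x = w/2, so shift only the x coordinates.
    boxes_b = boxes_b + np.array([w // 2, 0, w // 2, 0], dtype=np.float32)
    return mixed, np.concatenate([boxes_a, boxes_b], axis=0)
```

In a teacher-student loop, the teacher's pseudo boxes for an unlabeled image would be written to the cache once with `put`, and later batches would call `sample` and `mix_side_by_side` to build student inputs. Because the boxes are remapped by the same geometric transform as the pixels, the mixed image carries its labels directly, which is one plausible reading of why no separate box matching is needed afterwards.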

ZHU Siyuan, LI Jiasheng, ZOU Danping, HE Di, YU Wenxian. Unstructured Road Defect Detection Algorithm Based on Semi-Supervised Learning[J]. Computer Engineering, 2025, 51(9): 14-24.

https://www.ecice06.com/CN/Y2025/V51/I9/14

Figures/Tables: 15

Fig.1 Procedure of the MAM algorithm

Fig.2 Schematic diagram of annotation with the labelme tool

Fig.3 Comparison of algorithm detection results under different annotation ratios

Fig.4 Performance of two algorithms on the Defect dataset

Fig.5 Performance of MAM and fully supervised learning algorithms on different detectors

Fig.6 Grad-CAM visualization of the MAM algorithm

Fig.7 Comparison of gradient density between TP and FN samples under different confidence thresholds
