
Computer Engineering ›› 2023, Vol. 49 ›› Issue (1): 49-56. doi: 10.19678/j.issn.1000-3428.0063678

• Artificial Intelligence and Pattern Recognition •

Pseudo-Label Object Detection Algorithm Based on Classification Uncertainty

LEI Jie1, RAO Wenbi1,2, YANG Yanchao1, XIONG Shengwu1,2

  1. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China;
  2. Sanya Science and Education Innovation Park, Wuhan University of Technology, Sanya, Hainan 572000, China
  • Received: 2021-12-31 Revised: 2022-02-14 Published: 2023-01-06
  • About the authors: LEI Jie (born 1997), female, master's student, main research interest: few-shot object detection; RAO Wenbi, professor, Ph.D.; YANG Yanchao, laboratory technician, M.S.; XIONG Shengwu, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (62176194); Hubei Province Science and Technology Innovation Program (2020AAA001); Sanya Science and Education Innovation Park of Wuhan University of Technology project (2021KF0031).

Abstract: Pseudo-label object detection algorithms use large amounts of unlabeled data to generate pseudo-labeled data, enlarging the training set and thereby improving the performance of the object detection model. However, pseudo-labeled data contain many incorrect labels, which makes it difficult to improve the performance of pseudo-label object detection models. To address this problem, this paper proposes SoftTeacher-CUC, a pseudo-label object detection algorithm built on the SoftTeacher pseudo-label object detection algorithm. First, a classification uncertainty method computes the uncertainty of the classification results of the pseudo-labels generated by the model to judge whether each pseudo-label is reliable: the lower the uncertainty, the more reliable the classification result. Then, the computed uncertainty is incorporated as a weight into the classification loss function of the pseudo-labeled data, further reducing the negative impact of high-uncertainty pseudo-labels on the model. Finally, according to the roles of the different modules in the Teacher model, an Exponential Moving Average (EMA) method with different weights is used to update the Teacher model, which reduces the similarity between the parameters of the Teacher and Student models so that consistency regularization remains effective. Experimental results show that, when labeled data account for 1%, 5%, and 10% of the training set, SoftTeacher-CUC improves the mean Average Precision (mAP) over SoftTeacher by 1.4, 1.2, and 1.7 percentage points, respectively, indicating that the algorithm achieves better detection performance when labeled data are scarce.

Key words: object detection, pseudo-label, classification uncertainty, Exponential Moving Average (EMA), classification loss function, consistency regularization
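The abstract names two mechanisms, an uncertainty-weighted classification loss and a module-wise EMA update of the Teacher, but gives neither the uncertainty formula nor the EMA weights. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' implementation: uncertainty is assumed to be the normalized entropy of the Teacher's class probabilities, the loss weight is assumed to be one minus that uncertainty, and the module names (`backbone`, `roi_head`) and decay values are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Assumed uncertainty measure: entropy scaled to [0, 1].

    Close to 0 for a one-hot distribution, close to 1 for a uniform one.
    """
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def weighted_pseudo_label_loss(student_probs, pseudo_labels, teacher_probs):
    """Mean cross-entropy on pseudo-labels, down-weighted by uncertainty.

    Assumption: weight = 1 - uncertainty, so pseudo-labels the Teacher is
    unsure about contribute less to the Student's classification loss.
    """
    total = 0.0
    for s_probs, label, t_probs in zip(student_probs, pseudo_labels, teacher_probs):
        weight = 1.0 - normalized_entropy(t_probs)
        # Clamp the probability to avoid log(0).
        total += weight * -math.log(max(s_probs[label], 1e-12))
    return total / max(len(pseudo_labels), 1)


def ema_update(teacher, student, decay_by_module, default_decay=0.999):
    """Update Teacher parameters in place: t = d * t + (1 - d) * s.

    Parameters are plain floats keyed by "module.param" names; the decay d
    is looked up from the (hypothetical) module prefix of each name.
    """
    for name, s_value in student.items():
        module = name.split(".", 1)[0]
        d = decay_by_module.get(module, default_decay)
        teacher[name] = d * teacher[name] + (1.0 - d) * s_value
    return teacher


if __name__ == "__main__":
    # Two pseudo-labelled boxes: a confident Teacher and an unsure Teacher.
    loss = weighted_pseudo_label_loss(
        student_probs=[[0.7, 0.3], [0.6, 0.4]],
        pseudo_labels=[0, 0],
        teacher_probs=[[0.95, 0.05], [0.55, 0.45]],
    )
    print("weighted loss:", loss)

    teacher = {"backbone.w": 1.0, "roi_head.w": 1.0}
    student = {"backbone.w": 0.0, "roi_head.w": 0.0}
    print(ema_update(teacher, student, {"backbone": 0.999, "roi_head": 0.99}))
```

A smaller decay for one module lets that part of the Teacher follow the Student more closely, while a larger decay keeps the rest changing slowly. Which modules should receive which decay is a design choice the abstract does not specify.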

CLC Number: