
Computer Engineering ›› 2024, Vol. 50 ›› Issue (9): 304-312. doi: 10.19678/j.issn.1000-3428.0069039

• Graphics and Image Processing •

Super-Resolution-Aided Small-Target Detection Based on Multi-Task Learning

ZHANG Tianpeng*, HAN Jing, LÜ Xueqiang

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
  • Received: 2023-12-18 Online: 2024-09-15 Published: 2024-09-19
  • Contact: ZHANG Tianpeng
  • Supported by:
    National Natural Science Foundation of China (62171043); Beijing Natural Science Foundation (4232025); General Science and Technology Project of the Scientific Research Program of the Beijing Municipal Education Commission (KM202311232003)

Abstract:

Small targets typically have low resolution and a blurred appearance and are easily affected by occlusion and background interference, which makes accurate, real-time detection difficult. To improve detection performance, this study proposes Multi-YOLO, a super-resolution-aided small-target detection algorithm based on multi-task learning. First, a super-resolution auxiliary branch is introduced to guide the backbone network in extracting effective features, reducing the loss of small-target information. Second, a dual-detection-head training method is adopted in which an anchor-based head co-supervises an anchor-free head to improve detection accuracy. In addition, a CTR3 module is placed at the end of the backbone network to strengthen the association between target information and position awareness. Finally, only the detection branch is used at inference time to preserve inference speed. Experimental results show that Multi-YOLO improves on the baseline network on the VEDAI, COCO MiniTrain, and SPCD datasets. On VEDAI in particular, it raises the mean Average Precision (mAP) by 10.9% while keeping a model size close to that of the baseline. Compared with mainstream single-stage object detection networks, Multi-YOLO also performs well on small-target detection and achieves a balance between accuracy and speed.

Key words: deep learning, small-target detection, multi-task learning, super-resolution, attention mechanism
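
The training and inference arrangement summarized in the abstract can be illustrated with a minimal sketch. It assumes that the auxiliary losses are combined with the main detection loss as a weighted sum, which the abstract does not state. The names `LossWeights`, `training_loss`, and `active_branches`, as well as the weight values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    # Hypothetical weights; the abstract gives no values.
    anchor_free: float = 1.0       # detection head being supervised
    anchor_based: float = 0.5      # co-supervising detection head
    super_resolution: float = 0.1  # auxiliary super-resolution branch


def training_loss(anchor_free_loss, anchor_based_loss, sr_loss,
                  weights=LossWeights()):
    """Assumed weighted sum of the detection losses and the auxiliary SR loss."""
    return (weights.anchor_free * anchor_free_loss
            + weights.anchor_based * anchor_based_loss
            + weights.super_resolution * sr_loss)


def active_branches(training):
    """Branches that run in each mode.

    The super-resolution branch only shapes the shared backbone features
    during training, so it is skipped at inference and adds no latency.
    """
    if training:
        return ["detection", "super_resolution"]
    return ["detection"]
```

Because the auxiliary branch influences the model only through gradients on the shared backbone, dropping it at inference leaves the deployed detector's cost unchanged, which is consistent with the abstract's claim of a model size close to the baseline.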