
Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 310-320. doi: 10.19678/j.issn.1000-3428.0065481

• Development Research and Engineering Application •

Infrared Target Detection Algorithm Based on Strip Pooling and Attention Mechanism in Street Scene

Qianglong LI1, Xinwen ZHOU1,*, Meng'en WEI1, Yangzhou GAN2

  1. School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, Jiangsu, China
  2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
  • Received: 2022-08-10 Online: 2023-08-15 Published: 2023-08-15
  • Contact: Xinwen ZHOU
  • About the authors:

    Qianglong LI (born 1994), male, master's student; main research interests: machine vision and object detection

    Meng'en WEI, master's student

    Yangzhou GAN, associate researcher, Ph.D.

  • Funding:
    Guangdong Basic and Applied Basic Research Foundation (2020A1515010651)

Abstract:

Infrared images of street scenes contain little detail and have complex backgrounds, and existing target detection models suffer from low detection accuracy and slow detection speed on them. To address these issues, a new infrared target detection algorithm based on strip pooling and an attention mechanism is proposed. A Mixed Pooling Module (MPM), which combines strip pooling with a Pyramid Pooling Module (PPM), is used to improve the Spatial Pyramid Pooling Fast (SPPF) module. Strip pooling addresses the feature loss and contamination that traditional pooling operations introduce during target detection, which improves feature extraction for long, narrow targets; it also establishes global dependencies between isolated targets, so that the model gathers more feature information. Global pooling operations along the horizontal and vertical directions are added to the attention module to obtain the position of each target over the global extent of the feature map; this position information is embedded into the feature channels so that the algorithm locates targets more accurately and is less affected by complex backgrounds. Batch-Free Normalization (BFN) is used to block the accumulation of estimation shift in Batch Normalization (BN), which resolves the resulting performance degradation and further improves detection performance. Experimental results on the FLIR dataset show that the algorithm achieves an mAP (at an IoU threshold of 0.5) of 80.7% and an F1 score of 78.0%, which are 1.9 and 2.4 percentage points higher than those of YOLOv5, respectively.

Key words: infrared target detection, strip pooling, pyramid pooling, attention mechanism, Batch-Free Normalization (BFN)
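As a rough illustration of the pooling idea named in the abstract, the sketch below averages a single-channel feature map along horizontal and vertical strips and uses the two results to reweight each position. It is not the authors' implementation: the function names `strip_pool` and `strip_gate`, the toy input, and the plain sigmoid gate are assumptions made for this example, and the learned convolutions that a real strip pooling or attention module would apply are omitted.

```python
import math


def strip_pool(feature):
    """Average a 2D feature map along each axis.

    Returns (row_means, col_means): row_means[i] averages row i over all
    columns (a horizontal strip); col_means[j] averages column j over all
    rows (a vertical strip).
    """
    height, width = len(feature), len(feature[0])
    row_means = [sum(row) / width for row in feature]
    col_means = [sum(feature[i][j] for i in range(height)) / height
                 for j in range(width)]
    return row_means, col_means


def strip_gate(feature):
    """Reweight each position by a sigmoid of its row and column context.

    Simplification (assumption of this sketch): no learned convolutions are
    applied to the pooled strips, so only the data flow is illustrated.
    """
    row_means, col_means = strip_pool(feature)
    gated = []
    for i, row in enumerate(feature):
        gated_row = []
        for j, value in enumerate(row):
            context = row_means[i] + col_means[j]      # broadcast both strips
            weight = 1.0 / (1.0 + math.exp(-context))  # sigmoid gate in (0, 1)
            gated_row.append(value * weight)
        gated.append(gated_row)
    return gated


# Toy 3x4 map with a long, narrow "target" along the middle row.
feature_map = [
    [0.0, 0.0, 0.0, 0.0],
    [4.0, 4.0, 4.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
]
rows, cols = strip_pool(feature_map)
gated_map = strip_gate(feature_map)
```

In this toy input the horizontal strip for the middle row carries the whole target in a single value, which is the intuition behind using strip-shaped pooling windows for long, narrow objects; a square pooling window of the same area would mix in the empty rows above and below.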