
Computer Engineering ›› 2022, Vol. 48 ›› Issue (7): 264-269,299. doi: 10.19678/j.issn.1000-3428.0061872

• Graphics and Image Processing •

Object Detection Method Based on Feature Channel Modeling

ZHANG Yexing1,2, CHEN Min1,2, PAN Qiuyu1,2   

  1. Power China Huadong Engineering Corporation, Hangzhou 310000, China;
    2. Zhejiang Huadong Engineering Digital Technology Co., Ltd., Hangzhou 310000, China
  • Received: 2021-06-08 Revised: 2021-08-31 Online: 2022-07-15 Published: 2021-09-09

基于特征通道建模的目标检测方法

张业星1,2, 陈敏1,2, 潘秋羽1,2   

  1. 中国电建集团华东勘测设计研究院有限公司, 杭州 310000;
    2. 浙江华东工程数字技术有限公司, 杭州 310000
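  • About the authors: ZHANG Yexing (born 1982), male, senior engineer, main research interest: digitalization of construction engineering; CHEN Min, senior engineer; PAN Qiuyu, master's degree.
  • Funding:
    National Natural Science Foundation of China (61972225).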

Abstract: Predicting objects directly from multiscale fused feature maps, without channel modeling, results in poor robustness. To address this problem, a detection method based on multidimensional modeling of global image information is proposed. The method uses multistage feature reuse and feature fusion to reduce the loss of correlation among features. A Breadth Channel Modeling Branch (BCMB) and a Depth Channel Modeling Branch (DCMB) are designed to compensate for the lack of spatial information caused by changes in the receptive field and to enrich the contextual information among the objects in an image. The BCMB establishes a two-dimensional channel matrix in the width and height directions, which models the multilevel receptive field, enriching the spatial perception of the model and facilitating localization. The DCMB establishes a one-dimensional channel vector in the depth direction, which extracts the global features of the image, enriching the contextual description of the model and aiding classification. The channel maps generated by the two branches are fused with the input features by weighting, which strengthens the channel representation and makes the output features more sensitive to the position and category of each object. Experimental results on the PASCAL VOC 2007 test set show that the method reaches an mAP of 85.8%, up to 3.2 percentage points higher than the Baseline method without channel modeling.
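The two-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the abstract does not give the layer design, so global average pooling followed by a sigmoid stands in for both branches, and the names `depth_channel_vector`, `breadth_channel_matrix`, `fuse`, `alpha`, and `beta` are hypothetical. It shows only the shapes involved: a one-dimensional vector over the depth (channel) axis, a two-dimensional matrix over height and width, and a weighted fusion of both with the input feature.

```python
import numpy as np


def sigmoid(z):
    """Squash values into (0, 1) so they can act as weights."""
    return 1.0 / (1.0 + np.exp(-z))


def depth_channel_vector(x):
    """DCMB-like stand-in: one weight per channel.

    x has shape (C, H, W); averaging over H and W gives a (C,) vector
    that summarizes each channel globally.
    """
    return sigmoid(x.mean(axis=(1, 2)))


def breadth_channel_matrix(x):
    """BCMB-like stand-in: one weight per spatial position.

    Averaging over the channel axis gives an (H, W) matrix.
    """
    return sigmoid(x.mean(axis=0))


def fuse(x, alpha=0.5, beta=0.5):
    """Weighted fusion of both channel maps with the input feature.

    alpha and beta are assumed fusion weights, not values from the paper.
    """
    d = depth_channel_vector(x)      # shape (C,)
    s = breadth_channel_matrix(x)    # shape (H, W)
    # Broadcasting: d scales whole channels, s scales spatial positions.
    return x + alpha * x * d[:, None, None] + beta * x * s[None, :, :]


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))  # toy feature map: C=8, H=4, W=4
out = fuse(features)
print(out.shape)
```

The fused output keeps the shape of the input, so a block like this could in principle sit between a feature-fusion stage and the detection head; how the paper actually wires the branches is not stated in the abstract.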

Key words: object detection, contextual information, channel modeling, Convolutional Neural Networks (CNN), feature enhancement

摘要: 针对直接利用多尺度融合特征图进行目标检测时鲁棒性较差的问题,提出一种对图像全局信息进行多维建模的检测方法。采用多阶段的特征复用和特征融合减少特征间相关性损失,设计广度通道建模分支(BCMB)与深度通道建模分支(DCMB)弥补因感受野变化造成的图像空间信息不足,并丰富图像中各个目标间的上下文信息。通过BCMB建立宽高方向的二维通道矩阵,对多层级的感受野进行建模,进而丰富模型对图像的空间感知,完成目标定位。使用DCMB建立深度方向的一维通道向量,提炼图像的全局特征,丰富模型对图像的上下文描述,完成目标分类。将2个分支生成的通道图与输入特征进行加权融合,增强图像通道表达力,使输出的特征对目标的位置和类别信息更敏感。在PASCAL VOC 2007测试数据集上的实验结果表明,该方法的mAP值为85.8%,与未使用通道建模的Baseline方法相比,最高可提升3.2个百分点。

关键词: 目标检测, 上下文信息, 通道建模, 卷积神经网络, 特征增强

CLC Number: