
Computer Engineering ›› 2022, Vol. 48 ›› Issue (9): 239-247, 253. doi: 10.19678/j.issn.1000-3428.0062846

• Graphics and Image Processing •

Improved Instance Segmentation Method Based on Anchor-Free Segmentation Network

LIU Teng1,2, LIU Hongzhe1,2, LI Xuewei1,2, XU Cheng1,2

  1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China;
  2. College of Robotics, Beijing Union University, Beijing 100101, China
  • Received: 2021-09-30 Revised: 2021-11-25 Published: 2021-11-11
  • About the authors: LIU Teng (born 1994), male, master's student, research interests: computer vision and deep learning; LIU Hongzhe, professor, Ph.D.; LI Xuewei (corresponding author), professor, doctoral supervisor; XU Cheng, lecturer, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61871039, 62102033, 62171042, 61906017); Beijing Municipal Education Commission projects (KM202111417001, KM201911417001); Collaborative Innovation Center for Visual Intelligence project (CYXC2011); Beijing Union University academic research projects (ZB10202003, ZK40202101, ZK120202104); Beijing Union University graduate research and innovation project (YZ2020K001).

Abstract: In autonomous driving scenarios, existing anchor-free instance segmentation methods suffer from several problems: features of large targets overwhelm those of small targets, the Region of Interest (RoI) Align operation used in two-stage detectors is absent, and the position and spatial information that the category branch provides to the mask branch is ignored. As a result, feature extraction is insufficient and target regions cannot be obtained accurately. An improved anchor-free instance segmentation method is proposed. Combined with deformable convolution, an encoder-decoder feature extraction network is designed to extract high-resolution features and strengthen the extraction of small-target features. Dilated convolution and merging connections are adopted to effectively fuse features of multiple resolutions without increasing the amount of computation. On this basis, an attention mechanism is introduced into the category branch, and an information enhancement module combining spatial and channel information is designed to improve target detection. The experimental results show that the Average Precision (AP) and mean Intersection over Union (mIoU) of the proposed method on the COCO 2017 and Cityscapes datasets are 41.1% and 83.3%, respectively. Compared with Mask R-CNN, SOLO, Yolact, and other methods, the proposed method effectively improves instance segmentation results and is more robust.

Key words: anchor-free instance segmentation, Deep Learning (DL), encoder-decoder structure, attention mechanism, dilated convolution
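
The abstract reports results as mean Intersection over Union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch of the standard per-class IoU average on a single pair of label maps. It is not the authors' evaluation code: the function name `mean_iou`, the nested-list label format, and the choice to skip classes absent from both maps are assumptions made for illustration, and benchmark toolkits typically accumulate counts over a whole dataset rather than a single image.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Average per-class IoU between two label maps of equal shape.

    Hypothetical helper for illustration; classes that appear in neither
    map are skipped so they do not distort the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts once toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes.
prediction = [[0, 0], [1, 1]]
ground_truth = [[0, 1], [1, 1]]
score = mean_iou(prediction, ground_truth, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.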

CLC Number: