Computer Engineering, 2023, Vol. 49, Issue (3): 263-270. doi: 10.19678/j.issn.1000-3428.0063891

• Graphics and Image Processing •

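Semantic Segmentation Algorithm Based on Dense Connection and Feature Enhancement

MA Sugang1,2, CHEN Qimei1, HOU Zhiqiang1,2, YANG Xiaobao1,3, ZHANG Zixian1

  1. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China;
  2. Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China;
  3. Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
  • Received: 2022-02-08 Revised: 2022-04-26 Published: 2022-05-24
  • About the authors: MA Sugang (born 1982), male, Ph.D. candidate, main research interests: computer vision and machine learning; CHEN Qimei, master's student; HOU Zhiqiang, professor; YANG Xiaobao, senior engineer; ZHANG Zixian, master's student.
  • Funding:
    National Natural Science Foundation of China (62072370); Key Research and Development Program of Shaanxi Province (2018ZDCXL-GY-04-02).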

Abstract: The semantic segmentation algorithm DeepLabv3+ makes insufficient use of the feature information extracted by its backbone network, which leads to discontinuous segmentation edges, missed targets, and segmentation errors. To address these problems, a semantic segmentation algorithm based on dense connection and feature enhancement is proposed. The algorithm uses a Shared-Atrous Spatial Pyramid Pooling (S-ASPP) module to establish connections among multiple atrous convolutions, which strengthens the semantic association between local information, captures densely sampled pixels, and improves the utilization of high-level feature information. A Feature Pyramid Enhancement Module (FPEM) and a Feature Fusion Module (FFM) are then introduced to process the multilayer feature information output by the backbone network and enhance its expressive capability. The FFM fuses the multi-scale features output by the FPEM, which improves the complementarity between feature layers and yields more comprehensive feature maps. Finally, the outputs of the S-ASPP and the FFM are concatenated and convolved to obtain the final segmentation result. Experiments on the PASCAL VOC 2012 and Cityscapes datasets show that the proposed algorithm achieves mean Intersection over Union (mIoU) values of 81.13% and 73.39%, respectively, which are 2.3 and 2.1 percentage points higher than those of the baseline algorithm DeepLabv3+. By making full use of the feature information from each layer of the backbone network, the proposed algorithm improves segmentation accuracy and produces better segmentation results.
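The results in the abstract are reported as mean Intersection over Union (mIoU). As a reminder of how this metric is commonly computed, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the toy labels, and the use of 255 as an "ignore" label (a common convention for PASCAL VOC annotations) are assumptions made for illustration.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows = ground truth, columns = prediction.

    `ignore_index` is an assumed convention for unlabeled pixels.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a valid ground-truth label
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # class c pixels predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp    # other pixels predicted as class c
        union = tp + fp + fn
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps with 3 classes (hypothetical data).
target = [0, 0, 1, 1, 2, 2, 255, 1]
pred = [0, 1, 1, 1, 2, 0, 2, 1]

matrix = confusion_matrix(pred, target, num_classes=3)
miou = mean_iou(matrix)  # per-class IoU here is 1/3, 3/4, and 1/2
print(f"mIoU = {miou:.4f}")
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU values are averaged, so that classes with few pixels in a single image do not distort the score.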

Key words: semantic segmentation, DeepLabv3+ algorithm, Atrous Spatial Pyramid Pooling (ASPP), Feature Pyramid Enhancement Module (FPEM), feature fusion
