Computer Engineering, 2021, Vol. 47, Issue (7): 21-29. doi: 10.19678/j.issn.1000-3428.0059577

• Research Hotspots and Reviews •

Real-Time Scene Segmentation Algorithm for Indoor Service Robot

LIN Jie, CHEN Chunmei, LIU Guihua, ZHU Lijia   

  1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621000, China
  • Received: 2020-09-25; Revised: 2020-11-13; Published: 2020-12-01

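  • About the authors: LIN Jie (born 1996), male, master's student; main research interests: image processing, pattern recognition, and deep learning. CHEN Chunmei (corresponding author), associate professor, Ph.D. LIU Guihua, professor, Ph.D. ZHU Lijia, master's student.
  • Supported by:
    Nuclear Energy Development Research Project of the State Administration of Science, Technology and Industry for National Defense, "Research on Key Technologies of Nuclear Emergency Response Robots" (17zg610205); Key Research and Development Project of the Science and Technology Department of Sichuan Province, "Under-Vehicle Foreign Object Detection System Based on 2D and 3D Visual Image Fusion" (19ZS2117).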

Abstract: Real-time scene segmentation is a key technology for the development of indoor service robots. Semantic segmentation research has advanced considerably, but most existing methods rely on complex network structures or computationally expensive models to improve accuracy, while ignoring the actual deployment cost. To address the limited computing power of mobile robots, a lightweight bottleneck structure is designed and used as the basic building block of a lightweight scene segmentation network. The network is cascaded with a feature extraction network to obtain deeper semantic features, and it fuses shallow features with deep semantic features to obtain richer image features. It combines depthwise separable convolution with multi-scale dilated convolution to extract multi-scale image features while reducing the number of parameters and the amount of computation. A channel attention mechanism is also introduced to weight the features and improve segmentation accuracy. In experiments with 512×512 pixel input images, the proposed algorithm reaches an MIoU of 72.7% on the NYUDv2 indoor scene segmentation dataset and 59.9% on the CamVid dataset, with a computational cost of only 4.2 GFLOPs and 8.3 Mb of parameters. Deployed on the NVIDIA Jetson XavierNX embedded platform for mobile robots, the algorithm achieves an inference speed of 42 frame/s, and its real-time performance is better than that of DeepLabV3+, PSPNet, SegNet and UNet.
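
For readers unfamiliar with the MIoU figures quoted in the abstract, the following is a minimal sketch of how mean intersection over union is commonly computed from per-pixel labels. It is not the authors' evaluation code; the function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes that appear in the ground truth or the prediction.

    y_true, y_pred: flat sequences of integer class ids, one per pixel.
    Illustrative helper only; not taken from the paper.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp   # predicted c, true class otherwise
        union = tp + fp + fn
        if union > 0:                                # skip classes absent from both
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (made-up data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
toy_miou = mean_iou(labels, preds, 3)
print(toy_miou)
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5. Benchmark protocols differ in details such as ignored labels, so the reported 72.7% and 59.9% depend on each dataset's own evaluation rules.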

Key words: lightweight network, scene segmentation, depthwise separable convolution, dilated convolution, attention mechanism
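
As a rough illustration of why depthwise separable and dilated convolutions suit a lightweight network, the sketch below compares weight counts for a standard and a depthwise separable convolution and computes the spatial extent covered by a dilated kernel. The channel count of 256 and the dilation rates 1, 2 and 4 are assumed values chosen for the example, not figures from the paper, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dilated_kernel_extent(k, dilation):
    """Side length of the input window covered by a k x k kernel with given dilation."""
    return dilation * (k - 1) + 1


# Assumed layer size for illustration only.
standard = conv_params(256, 256, 3)
separable = depthwise_separable_params(256, 256, 3)
print(standard, separable, round(standard / separable, 2))

# Assumed dilation rates; each keeps 3 x 3 = 9 weights per filter
# while covering a progressively larger window.
extents = [dilated_kernel_extent(3, d) for d in (1, 2, 4)]
print(extents)
```

With these assumed sizes the standard layer has 589824 weights and the separable one 67840, roughly 8.7 times fewer, while the three dilated kernels cover windows of 3, 5 and 9 pixels per side without adding weights.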

CLC Number: