Computer Engineering ›› 2023, Vol. 49 ›› Issue (3): 248-256. doi: 10.19678/j.issn.1000-3428.0064100

• Graphics and Image Processing •

Semantic Segmentation Method for Outdoor Point Clouds Based on Contextual Attention

SU Mingfang, HU Likun, HUANG Runhui   

  1. School of Electrical Engineering, Guangxi University, Nanning 530004, China
  • Received: 2021-12-30; Revised: 2022-03-16; Published: 2022-05-25

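  • About the authors: SU Mingfang (苏鸣方), born 1997, male, master's student, main research interest: 3D environment perception; HU Likun (胡立坤), corresponding author, professor, Ph.D.; HUANG Runhui (黄润辉), master's student.
  • Supported by:
    National Natural Science Foundation of China, "Gait Planning and Coordinated Control of Hexapod Bionic Robots Based on Semantic Maps and Stability Evaluation" (61863002); Innovation Project of Guangxi Graduate Education, "Terrain Recognition and Semantic Map Construction for Legged Robots" (YCSW2020003).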

Abstract: Point-based semantic segmentation methods, which operate directly on raw points, avoid the information loss caused by converting point clouds into structured representations; however, they do not fully exploit multi-scale contextual features, which lowers segmentation accuracy for small objects such as pedestrians and bicycles. A point cloud semantic segmentation method based on contextual attention is proposed; it consists of Bidirectional Contextual Attention Fusion (BCAF) and a Contextual Encoding-Channel Self-Attention (CE-CSA) module. The forward attention channel fuses features of adjacent scales to obtain more shallow fine-grained information, and the reverse attention channel further fuses high-level semantic information to enhance the contextual awareness of the model. The CE-CSA module is designed to capture global contextual information: by encoding multi-scale features and assigning different weights to feature channels, it makes the network pay more attention to specific channel features and reduces feature redundancy. Experiments on two large-scale outdoor point cloud datasets, SemanticKITTI and Semantic3D, show that the proposed method achieves mean Intersection over Union (mIoU) values of 55.0% and 76.4%, respectively. On SemanticKITTI, compared with the baseline method RandLA-Net, the proposed method improves the Intersection over Union (IoU) for pedestrians and bicycles by 3.0 and 6.9 percentage points, respectively. These results indicate that the method effectively captures multi-scale contextual information and improves segmentation accuracy for small objects.
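For readers unfamiliar with the metric reported above, the following is a minimal sketch of how per-class Intersection over Union and its mean are commonly computed from point-wise labels. It is not the authors' evaluation code: the function names `per_class_iou` and `mean_iou` and the toy labels are assumptions made for illustration only, and benchmark-specific details (for example, which classes are ignored) are left out.

```python
from typing import List, Sequence


def per_class_iou(y_true: Sequence[int], y_pred: Sequence[int],
                  num_classes: int) -> List[float]:
    """Return IoU = TP / (TP + FP + FN) per class; NaN if a class never occurs."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # point correctly labelled as class t
        else:
            fp[p] += 1      # point wrongly assigned to class p
            fn[t] += 1      # point of class t that was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else float("nan"))
    return ious


def mean_iou(ious: Sequence[float]) -> float:
    """Average the per-class IoU values, skipping classes marked NaN."""
    valid = [v for v in ious if v == v]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid) if valid else float("nan")


# Toy example with three hypothetical classes (labels are made up, not paper data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(y_true, y_pred, num_classes=3)
miou = mean_iou(ious)
```

Because IoU is itself a percentage, a gain stated in percentage points is an absolute difference between two IoU values, not a relative improvement.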

Key words: semantic segmentation, contextual attention, outdoor point clouds, multi-scale features, Channel Self-Attention (CSA), small objects in point clouds
