Semantic Segmentation Method for Outdoor Point Clouds Based on Contextual Attention

doi:10.19678/j.issn.1000-3428.0064100

Abstract

Semantic segmentation methods that operate directly on points avoid the information loss caused by converting point clouds into structured representations; however, they do not fully exploit multi-scale contextual features, which lowers segmentation accuracy for small objects such as pedestrians and bicycles. A point cloud semantic segmentation method based on contextual attention is proposed; it consists of Bidirectional Contextual Attention Fusion (BCAF) and a Contextual Encoding-Channel Self-Attention (CE-CSA) module. The forward attention channel fuses features of adjacent scales to obtain more shallow, fine-grained information, and the reverse attention channel further fuses high-level semantic information to enhance the contextual awareness of the model. To capture global contextual information, the CE-CSA module encodes multi-scale features and assigns different weights to feature channels, so that the network pays more attention to specific channel features and feature redundancy is reduced. Experiments on two large-scale outdoor point cloud datasets, SemanticKITTI and Semantic3D, show that the proposed method achieves a mean Intersection over Union (mIoU) of 55.0% and 76.4%, respectively. On the SemanticKITTI dataset, compared with the baseline method RandLA-Net, the Intersection over Union (IoU) for pedestrians and bicycles increases by 3.0 and 6.9 percentage points, respectively. The method effectively captures multi-scale contextual information and improves segmentation accuracy for small objects.

Key words: semantic segmentation, contextual attention, outdoor point clouds, multi-scale feature, Channel Self-Attention (CSA), point cloud small targets
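
The results in the abstract are reported as Intersection over Union (IoU) per class and as its mean over classes (mIoU). As a reading aid, the sketch below shows how these metrics are conventionally computed from a confusion matrix. It is not taken from the paper: the function names `per_class_iou` and `mean_iou`, the three-class toy labels, and the choice to skip classes absent from both arrays are assumptions made for illustration only.

```python
import numpy as np

def per_class_iou(y_true, y_pred, num_classes):
    """Return IoU per class; classes absent from both arrays yield NaN."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground-truth labels, columns are predictions.
    conf = np.bincount(num_classes * y_true + y_pred,
                       minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    tp = np.diag(conf).astype(float)   # correctly labeled points
    fp = conf.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp         # points of the class that were missed
    denom = tp + fp + fn
    # IoU = TP / (TP + FP + FN); guard against division by zero.
    return np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)

def mean_iou(y_true, y_pred, num_classes):
    """Average the per-class IoU, ignoring classes that never occur."""
    return float(np.nanmean(per_class_iou(y_true, y_pred, num_classes)))

# Toy example with three hypothetical classes (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
iou = per_class_iou(y_true, y_pred, 3)
miou = mean_iou(y_true, y_pred, 3)
print(iou, miou)
```

A gain stated in percentage points, as in the comparison with RandLA-Net, is the plain difference between two such per-class IoU values expressed in percent, not a relative change.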

CLC number: TP391

SU Mingfang, HU Likun, HUANG Runhui. Semantic Segmentation Method for Outdoor Point Clouds Based on Contextual Attention[J]. Computer Engineering, 2023, 49(3): 248-256.

http://www.ecice06.com/CN/Y2023/V49/I3/248

Figures/Tables: 11


References

[1] ZHANG Y, ZHOU Z X, DAVID P, et al. PolarNet: an improved grid representation for online LiDAR point clouds semantic segmentation[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2020: 9598-9607.
[2] KIM M, ILYAS N, KIM K. AMSASeg: an attention-based multi-scale atrous convolutional neural network for real-time object segmentation from 3D point cloud[J]. IEEE Access, 2021, 9: 70789-70796.
[3] CHENG R, RAZANI R, TAGHAVI E, et al. (AF)2-S3Net: attentive feature fusion with adaptive feature selection for sparse semantic segmentation network[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2021: 12542-12551.
[4] ZHU X G, ZHOU H, WANG T, et al. Cylindrical and asymmetrical 3D convolution networks for LiDAR-based perception[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(10): 6807-6822.
[5] YANG X W, LI J, HAN X, et al. Octree-based convolutional neural networks for 3D model segmentation[J]. Computer Engineering and Design, 2020, 41(9): 2663-2669. (in Chinese)
[6] QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 77-85.
[7] QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 5105-5114.
[8] THOMAS H, QI C R, DESCHAUD J E, et al. KPConv: flexible and deformable convolution for point clouds[C]//Proceedings of International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 6410-6419.
[9] YANG J, DANG J S. Semantic segmentation of 3D point cloud based on contextual attention CNN[J]. Journal of Communications, 2020, 41(7): 195-203. (in Chinese)
[10] BAI J, XU H J. MSP-Net: multi-scale point cloud classification network[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(11): 1917-1924. (in Chinese)
[11] TIAN Y J, GUAN Y Q, GONG R. A robust deep neural network for multi-feature point cloud classification and segmentation[J]. Computer Engineering, 2021, 47(11): 234-240. (in Chinese)
[12] YU L L, YU H Y, HE Z X, et al. Point cloud scene segmentation based on dual attention mechanism and multi-scale features[J]. Laser & Optoelectronics Progress, 2021, 58(24): 471-479. (in Chinese)
[13] LIN H J, LUO Z P, LI W, et al. Adaptive pyramid context fusion for point cloud perception[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[14] LANDRIEU L, SIMONOVSKY M. Large-scale point cloud semantic segmentation with superpoint graphs[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 4558-4567.
[15] HU Q Y, YANG B, XIE L H, et al. RandLA-Net: efficient semantic segmentation of large-scale point clouds[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2020: 11105-11114.
[16] SHUAI H, XU X, LIU Q S. Backward attentive fusing network with local aggregation classifier for 3D point cloud semantic segmentation[J]. IEEE Transactions on Image Processing, 2021, 30: 4973-4984.
[17] LIU H, GUO Y L, MA Y N, et al. Semantic context encoding for accurate 3D point cloud segmentation[J]. IEEE Transactions on Multimedia, 2021, 23: 2045-2055.
[18] DENG S, DONG Q L. GA-NET: global attention network for point cloud semantic segmentation[J]. IEEE Signal Processing Letters, 2021, 28: 1300-1304.
[19] GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud Transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
[20] BEHLEY J, GARBADE M, MILIOTO A, et al. SemanticKITTI: a dataset for semantic scene understanding of LiDAR sequences[C]//Proceedings of International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 9296-9306.
[21] HACKEL T, SAVINOV N, LADICKY L, et al. Semantic3D.net: a new large-scale point cloud classification benchmark[EB/OL]. [2021-11-25]. https://arxiv.org/pdf/1704.03847.pdf.
[22] SU H, JAMPANI V, SUN D Q, et al. SPLATNet: sparse lattice networks for point cloud processing[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 2530-2539.
[23] TATARCHENKO M, PARK J, KOLTUN V, et al. Tangent convolutions for dense prediction in 3D[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 3887-3896.
[24] WU B C, WAN A, YUE X Y, et al. SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud[C]//Proceedings of International Conference on Robotics and Automation. Washington D.C., USA: IEEE Press, 2018: 1887-1893.
[25] WU B C, ZHOU X Y, ZHAO S C, et al. SqueezeSegV2: improved model structure and unsupervised domain adaptation for road-object segmentation from a LiDAR point cloud[C]//Proceedings of International Conference on Robotics and Automation. Washington D.C., USA: IEEE Press, 2019: 4376-4382.
[26] MILIOTO A, VIZZO I, BEHLEY J, et al. RangeNet++: fast and accurate LiDAR semantic segmentation[C]//Proceedings of International Conference on Intelligent Robots and Systems. Washington D.C., USA: IEEE Press, 2019: 4213-4220.
[27] TCHAPMI L, CHOY C, ARMENI I, et al. SEGCloud: semantic segmentation of 3D point clouds[C]//Proceedings of International Conference on 3D Vision. Washington D.C., USA: IEEE Press, 2017: 537-547.
[28] THOMAS H, GOULETTE F, DESCHAUD J E, et al. Semantic classification of 3D point clouds with multiscale spherical neighborhoods[C]//Proceedings of International Conference on 3D Vision. Washington D.C., USA: IEEE Press, 2018: 390-398.
[29] ZHANG Z Y, HUA B S, YEUNG S K. ShellNet: efficient point cloud convolutional neural networks using concentric shells statistics[C]//Proceedings of International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 1607-1616.
[30] WANG L, HUANG Y C, HOU Y L, et al. Graph attention convolution for point cloud semantic segmentation[C]//Proceedings of Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 10288-10297.
