Visual SLAM Method Based on Semantic Segmentation in Dynamic Scenes

Computer Engineering, 2024, Vol. 50, Issue (3): 242-249. doi: 10.19678/j.issn.1000-3428.0067370

Xiaoying DU¹, Qingni YUAN¹,²,³,*, Jianyou QI¹, Chen WANG¹, Feilong DU¹, Ao REN¹

1. Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guizhou University, Guiyang 550025, Guizhou, China
2. School of Mechanical Engineering, Guizhou University, Guiyang 550025, Guizhou, China
3. State Key Laboratory of Public Big Data Jointly Built by Provincial and Ministerial Governments, Guizhou University, Guiyang 550025, Guizhou, China

Received: 2023-04-06 Published: 2024-03-15 Online: 2023-06-16
Contact: Qingni YUAN
Supported by:
National Natural Science Foundation of China (52165063); National Natural Science Foundation of China (52065010); Guizhou Provincial Department of Science and Technology Project ([2022] Key 024); Guizhou Provincial Department of Science and Technology Project ([2022] General 140); Guizhou Provincial Department of Science and Technology Project ([2023] General 094); Guizhou Provincial Department of Science and Technology Project ([2023] General 025); Guizhou University Laboratory Open Project (SYSKF2023-089)

Abstract:

To address the poor robustness of visual Simultaneous Localization and Mapping (SLAM) in dynamic scenes, where dynamic objects degrade localization and mapping accuracy, a semantic visual SLAM algorithm based on an improved DeepLabv3plus network and multiview geometry is designed. Building on the semantic segmentation network DeepLabv3plus, the lightweight convolutional network MobileNetV2 is used for feature extraction, depthwise separable convolutions replace the standard convolutions in the Atrous Spatial Pyramid Pooling (ASPP) module, and an attention mechanism is introduced, yielding an improved DeepLabv3plus. The improved network is combined with multiview geometry in a dynamic point detection method that enhances the robustness of visual SLAM in dynamic scenes. On this basis, a three-dimensional static semantic map containing both semantic and geometric information is constructed. Experimental results on the TUM dataset show that, compared with ORB-SLAM2, the algorithm improves the Root Mean Square Error (RMSE) and Standard Deviation (SD) of the absolute trajectory error on highly dynamic sequences by up to 98% and 97%, respectively.
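The saving from swapping standard convolutions for depthwise separable ones in ASPP can be illustrated with a simple parameter count. The following is a minimal sketch, not the authors' implementation: the channel sizes (320 in, 256 out) and the 3×3 kernel are assumed values chosen for illustration, biases are ignored, and both function names are hypothetical.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed sizes for illustration only; the paper's actual values may differ.
    c_in, c_out, k = 320, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

The ratio of the two counts is 1/c_out + 1/k², so for a 3×3 kernel the separable layer needs a little over one ninth of the weights. The dilation rate of an atrous convolution changes where the kernel samples, not how many weights it has, so the same count applies to each ASPP branch.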

Key words: DeepLabv3plus network, visual Simultaneous Localization and Mapping (SLAM), multiview geometry, dynamic scenes, semantic map
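The RMSE and SD figures quoted in the abstract are statistics of the absolute trajectory error. As a rough sketch of how such numbers are typically obtained, the code below computes both statistics from per-pose translational errors and then a relative improvement over a baseline. It assumes the two trajectories are already time-associated and aligned, uses the population standard deviation, and assumes the improvement is defined as (baseline - proposed) / baseline × 100%; the abstract does not state the formula, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ate_stats(estimated: Sequence[Point], ground_truth: Sequence[Point]) -> Tuple[float, float]:
    """Return (RMSE, SD) of translational errors between matched, aligned poses."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    errors = [math.dist(p, q) for p, q in zip(estimated, ground_truth)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean = sum(errors) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)  # population SD
    return rmse, sd


def improvement_percent(baseline: float, proposed: float) -> float:
    """Relative error reduction in percent (assumed definition)."""
    return (baseline - proposed) / baseline * 100.0


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the paper.
    truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    drifted = [(0.0, 0.3, 0.0), (1.0, 0.4, 0.0), (2.0, 0.5, 0.0)]
    rmse, sd = ate_stats(drifted, truth)
    print(f"RMSE = {rmse:.4f} m, SD = {sd:.4f} m")
```

Under this definition, an improvement of 98% means the proposed system's error is about one fiftieth of the baseline's, which is why the metric is read as an error reduction rather than an increase.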

Xiaoying DU, Qingni YUAN, Jianyou QI, Chen WANG, Feilong DU, Ao REN. Visual SLAM Method Based on Semantic Segmentation in Dynamic Scenes[J]. Computer Engineering, 2024, 50(3): 242-249.

https://www.ecice06.com/CN/Y2024/V50/I3/242

Figures/Tables 16

Fig.1 Overall framework of visual SLAM

Fig.2 Structure of the improved DeepLabv3plus network

Fig.3 Schematic diagram of multiview geometry principles

Fig.4 Comparison of DeepLabv3plus segmentation results before and after improvement

Fig.5 Comparison of x-, y-, and z-axis displacements of different algorithms on the fr3/w/rpy sequence

Fig.6 Comparison of attitude angles of different algorithms on the fr3/w/rpy sequence

Fig.7 Trajectories of different algorithms on the highly dynamic sequence fr3/w/xyz

Fig.8 Real motion trajectory of the mobile robot

Fig.9 Feature extraction in the real scene

Fig.10 Three-dimensional trajectory comparison between the proposed algorithm and ORB-SLAM2

Fig.11 fr3/w/static subsequence

Fig.12 Point cloud and octree maps generated by the proposed algorithm
