
Computer Engineering ›› 2024, Vol. 50 ›› Issue (3): 242-249. doi: 10.19678/j.issn.1000-3428.0067370

• Graphics and Image Processing •


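  • Supported by:
    National Natural Science Foundation of China (52165063); National Natural Science Foundation of China (52065010); Guizhou Provincial Department of Science and Technology Project ([2022] Key 024); Guizhou Provincial Department of Science and Technology Project ([2022] General 140); Guizhou Provincial Department of Science and Technology Project ([2023] General 094); Guizhou Provincial Department of Science and Technology Project ([2023] General 025); Guizhou University Laboratory Open Project (SYSKF2023-089)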

Visual SLAM Method Based on Semantic Segmentation in Dynamic Scenes

Xiaoying DU1, Qingni YUAN1,2,3,*, Jianyou QI1, Chen WANG1, Feilong DU1, Ao REN1

  1. Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guizhou University, Guiyang 550025, Guizhou, China
  2. School of Mechanical Engineering, Guizhou University, Guiyang 550025, Guizhou, China
  3. State Key Laboratory of Public Big Data Jointly Built by Provincial and Ministerial Governments, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-04-06 Online: 2024-03-15 Published: 2023-06-16
  • Contact: Qingni YUAN





To address the poor robustness of visual Simultaneous Localization and Mapping (SLAM) in dynamic scenes and its susceptibility to interference from dynamic objects, a semantic visual SLAM algorithm is designed that combines an improved DeepLabv3plus semantic segmentation network with multiview geometry. Starting from DeepLabv3plus, the lightweight convolutional network MobileNetV2 is used for feature extraction, depthwise separable convolutions replace the standard convolutions in the Atrous Spatial Pyramid Pooling (ASPP) module, and an attention mechanism is introduced, yielding the improved DeepLabv3plus network. Combining this improved network with multiview geometry, a dynamic point detection method is proposed to enhance the robustness of visual SLAM in dynamic scenes. On this basis, a three-dimensional semantic static map containing both semantic and geometric information is constructed. Experimental results on the TUM dataset show that, compared with ORB-SLAM2, the Root Mean Square Error (RMSE) and Standard Deviation (SD) are improved (that is, reduced) by more than 98% and 97%, respectively, in the best case.

Key words: DeepLabv3plus network, visual Simultaneous Localization and Mapping (SLAM), multiview geometry, dynamic scenes, semantic map
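
The abstract states that depthwise separable convolutions replace the standard convolutions in the ASPP module. The sketch below illustrates why this substitution makes a layer lighter by counting weights for both layer types. The channel counts and kernel size are hypothetical values chosen for illustration and are not taken from the paper; biases and normalization layers are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only (not from the paper).
c_in, c_out, k = 320, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # algebraically equal to 1 / c_out + 1 / k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

Under these assumed sizes the separable layer keeps roughly one ninth of the weights, since the ratio approaches 1/k² as the number of output channels grows. Atrous (dilated) sampling, as used in ASPP, changes the receptive field but not the weight count, so the same comparison applies.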