
Computer Engineering ›› 2022, Vol. 48 ›› Issue (1): 236-244,252. doi: 10.19678/j.issn.1000-3428.0060052

• Graphics and Image Processing •

RGB-D SLAM Algorithm Based on K-Means Clustering and Deep Learning

ZHANG Chenyang, HUANG Teng, WU Zhuangzhuang

  1. School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China
  • Received: 2020-11-18 Revised: 2021-01-22 Published: 2021-01-27
  • About the authors: ZHANG Chenyang (born 1991), male, Ph.D. candidate, main research interest: visual SLAM; HUANG Teng, research fellow and Ph.D. supervisor; WU Zhuangzhuang, master's student.
  • Funding:
    Fundamental Research Funds for the Central Universities (B200203106); Graduate Research and Innovation Program of Jiangsu Province (KYCX20_0485).

Abstract: Traditional RGB-D visual Simultaneous Localization and Mapping (SLAM) algorithms are prone to incorrect data association when they encounter dynamic features in dynamic scenes, which degrades the accuracy of the estimated camera pose. To address this problem, an RGB-D SLAM algorithm for dynamic scenes is proposed. A new cross-platform neural network deep learning framework is used to detect dynamic semantic features in the scene and to segment and extract the corresponding dynamic semantic feature regions. K-Means clustering on the depth image is then combined with these regions to cluster the depth values of the point features. Based on the clustering results, the dynamic feature points are removed, and the remaining feature points are used to compute the pose of the RGB-D camera. Experimental results show that, compared with ORB-SLAM2, OFD-SLAM, MR-SLAM, and other algorithms, the proposed algorithm reduces tracking error in dynamic scenes and improves the accuracy and robustness of camera pose estimation. On the TUM dynamic dataset, the root mean square error of its absolute camera trajectory is about 0.019 m.

Key words: Simultaneous Localization and Mapping (SLAM), dynamic scene, Deep Learning (DL), object detection, K-Means clustering
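
The depth-clustering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `kmeans_1d` and `filter_dynamic_points`, the rectangular detection box, the choice of k = 2, and the rule that the nearest depth cluster inside the box belongs to the dynamic object are all assumptions made for illustration only; the abstract does not specify these details.

```python
import numpy as np


def kmeans_1d(values, k=2, iters=20):
    """Plain Lloyd's K-Means on scalar values; returns (labels, centers)."""
    values = np.asarray(values, dtype=float)
    # Deterministic initialisation: spread the centers over the value range.
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centers[j] = values[labels == j].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so that labels always match the returned centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


def filter_dynamic_points(keypoints, depths, box, k=2):
    """Return a boolean mask that is True for feature points to keep.

    keypoints: (N, 2) pixel coordinates (u, v)
    depths:    (N,) depth in metres, 0 meaning "no valid depth"
    box:       (u_min, v_min, u_max, v_max) of a detected dynamic object
               (hypothetical detector output)
    """
    keypoints = np.asarray(keypoints, dtype=float)
    depths = np.asarray(depths, dtype=float)
    u, v = keypoints[:, 0], keypoints[:, 1]
    u_min, v_min, u_max, v_max = box
    in_box = (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)
    in_box &= depths > 0  # points without valid depth are not clustered

    keep = np.ones(len(depths), dtype=bool)
    idx = np.flatnonzero(in_box)
    if idx.size < k:
        return keep  # too few points in the box to cluster

    labels, centers = kmeans_1d(depths[idx], k=k)
    # ASSUMPTION: the detected object is in the foreground, so the cluster
    # with the smallest mean depth is treated as dynamic and discarded.
    dynamic_cluster = int(np.argmin(centers))
    keep[idx[labels == dynamic_cluster]] = False
    return keep


# Toy data: three points on a near object, two on the background seen
# through the same box, and one point outside the box.
keypoints = [[50, 50], [60, 55], [55, 60], [52, 70], [58, 45], [200, 200]]
depths = [1.0, 1.1, 0.9, 3.0, 3.2, 2.5]
box = (40, 40, 80, 80)
keep_mask = filter_dynamic_points(keypoints, depths, box)
print(keep_mask)
```

Clustering on depth rather than discarding every point inside the box keeps background features that happen to fall within the detection region, so more static points remain available for pose estimation. In a full pipeline the kept points would then be passed to the pose solver.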
