Dynamic Visual SLAM System Integrating Optical Flow and Multi-View Geometry

doi:10.19678/j.issn.1000-3428.0067775

Abstract

Under dynamic interference, visual Simultaneous Localization and Mapping (SLAM) loses localization accuracy and cannot build an accurate static map. To address this, a dynamic visual SLAM system that combines optical flow with multi-view geometry is proposed, built on ORB-SLAM2. In the tracking thread, processed optical flow information is combined with multi-view geometry to produce a dynamic-region mask that segments each image frame in the field of view. Dynamic regions are thereby detected and the feature points inside them are filtered out, which improves tracking accuracy while preserving the real-time performance of the visual SLAM system. The original mapping thread is replaced. The new mapping thread introduces optical flow information and the MobileNetV2 instance segmentation network. The instance segmentation results are combined with the optical-flow dynamic-region mask to segment the acquired ordered point cloud layer by layer, which resolves the "ghosting" (trailing) artifacts that dynamic objects cause during map construction. Semantic information is then fused into the segmented point-cloud clusters to build a static semantic octree map (OctoMap). Experimental results on the TUM Dynamic Objects dataset show that, compared with ORB-SLAM2, the proposed algorithm improves localization accuracy on high-dynamic scene sequences by 70.4% on average and by up to 90%.

Keywords: Simultaneous Localization and Mapping (SLAM), optical flow, multi-view geometry, dynamic scenes, moving object detection, instance segmentation, point cloud segmentation
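The feature-filtering idea described in the abstract (discard feature points that fall in a dynamic region, then check the rest against multi-view geometry) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the dynamic-region mask has already been derived from optical flow, that a fundamental matrix `F` has already been estimated, and that a point-to-epipolar-line distance test stands in for the multi-view geometry check. The function names, the `max_dist` threshold, and the toy data are all hypothetical.

```python
from math import hypot


def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 to the epipolar line of p1 in the second image."""
    x, y = p1
    # Epipolar line l = F * [x, y, 1]^T, written as a*u + b*v + c = 0.
    a = F[0][0] * x + F[0][1] * y + F[0][2]
    b = F[1][0] * x + F[1][1] * y + F[1][2]
    c = F[2][0] * x + F[2][1] * y + F[2][2]
    norm = hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line: treat the match as unreliable
    u, v = p2
    return abs(a * u + b * v + c) / norm


def filter_static_matches(matches, flow_mask, F, max_dist=1.0):
    """Keep matches outside the dynamic mask that satisfy the epipolar test.

    matches   : list of ((x1, y1), (x2, y2)) pixel pairs (previous, current)
    flow_mask : 2D list, nonzero where optical flow flagged a dynamic region
    max_dist  : assumed threshold in pixels (hypothetical value)
    """
    static = []
    for p1, p2 in matches:
        col, row = int(round(p2[0])), int(round(p2[1]))
        if not (0 <= row < len(flow_mask) and 0 <= col < len(flow_mask[0])):
            continue  # point falls outside the image: drop it
        if flow_mask[row][col]:
            continue  # inside a dynamic region: drop it
        if epipolar_distance(F, p1, p2) > max_dist:
            continue  # violates the epipolar constraint: likely moving
        static.append((p1, p2))
    return static


# Toy data: F for a pure horizontal camera translation, so the epipolar
# line of (x, y) is the image row v = y.
F_demo = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
mask_demo = [[0] * 4 for _ in range(4)]
mask_demo[1][2] = 1  # row 1, column 2 is flagged as dynamic
matches_demo = [
    ((0.0, 0.0), (1.0, 0.0)),  # static: on its epipolar line, unmasked
    ((1.0, 1.0), (2.0, 1.0)),  # lands in the dynamic mask
    ((0.0, 1.0), (1.0, 3.0)),  # 2 px off its epipolar line
]
kept_demo = filter_static_matches(matches_demo, mask_demo, F_demo)
print(kept_demo)  # [((0.0, 0.0), (1.0, 0.0))]
```

Checking the mask first is a design choice in this sketch: it is a cheap lookup, so the geometric test only runs on points that the optical-flow mask did not already reject.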

CLC number: TP391

ZHOU Qinyuan, DENG Yueping, ZHANG Lei, ZHANG Chen, LU Rirong, HU Xianzhe. Dynamic Visual SLAM System Integrating Optical Flow and Multi-View Geometry[J]. Computer Engineering, 2024, 50(5): 250-259.

https://www.ecice06.com/CN/Y2024/V50/I5/250

