
Computer Engineering, 2022, Vol. 48, Issue (10): 1-12. doi: 10.19678/j.issn.1000-3428.0064294

• Hot Topics and Reviews •

Survey of Binocular Stereo-matching Methods Based on Deep Learning

YIN Chenyang, ZHI Henghui, LI Huibin

  1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China
  • Received: 2022-03-24 Revised: 2022-06-18 Published: 2022-07-13
  • About the authors: YIN Chenyang (born 1998), male, master's student, main research interests: stereo matching and 3D reconstruction; ZHI Henghui, master's student; LI Huibin (corresponding author), associate professor, Ph.D.
  • Funding:
    General Program of the National Natural Science Foundation of China (61976173); Ministry of Education-China Mobile Artificial Intelligence Construction Project (MCM20190701).

Abstract: Binocular stereo matching is a classical problem in computer vision and is widely used in tasks such as autonomous driving, remote sensing, and robot perception. Its main goal is to find the correspondences between matching (homologous) points in a binocular image pair and to recover depth information from them using the triangulation principle. In recent years, stereo-matching methods based on deep learning have far surpassed traditional methods in both matching accuracy and matching efficiency. This survey divides existing deep-learning-based stereo-matching methods into non-end-to-end and end-to-end methods. Non-end-to-end methods use a deep neural network to replace one step of the traditional stereo-matching pipeline; depending on which step is replaced, they fall into three types: methods based on cost-computation networks, on cost-aggregation networks, and on disparity-refinement networks. End-to-end methods are divided, according to the dimensionality of the cost volume, into 3D-cost-volume-based and 4D-cost-volume-based methods. Representative non-end-to-end and end-to-end methods are analyzed in terms of matching accuracy, time complexity, and application scenarios, and the advantages and limitations of each type are summarized. On this basis, the main challenges currently facing deep-learning-based stereo matching are summarized and future research directions in the field are discussed.

Key words: computer vision, deep learning, binocular images, stereo-matching method, image depth
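
As a rough illustration of two ideas named in the abstract, the NumPy sketch below shows (a) the triangulation relation for a rectified pair, Z = f·B/d, and (b) how a 3D and a 4D cost volume differ in shape. It is a minimal sketch under stated assumptions, not the construction used by any specific method in the survey: the function names, the correlation-versus-concatenation choice, and the feature layout (C, H, W) are all illustrative assumptions.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Triangulation for a rectified pair: Z = f * B / d (inf where d <= 0)."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def correlation_cost_volume(left, right, max_disp):
    """Assumed 3D volume (D, H, W): mean feature correlation per candidate disparity."""
    c, h, w = left.shape
    volume = np.zeros((max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        # Left pixel x is compared with right pixel x - d.
        volume[d, :, d:] = (left[:, :, d:] * right[:, :, :w - d]).mean(axis=0)
    return volume


def concat_cost_volume(left, right, max_disp):
    """Assumed 4D volume (2C, D, H, W): left and shifted right features stacked."""
    c, h, w = left.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        volume[:c, d, :, d:] = left[:, :, d:]
        volume[c:, d, :, d:] = right[:, :, :w - d]
    return volume


# Hypothetical feature maps: 8 channels, 4 rows, 16 columns.
rng = np.random.default_rng(0)
left = rng.standard_normal((8, 4, 16)).astype(np.float32)
right = rng.standard_normal((8, 4, 16)).astype(np.float32)

vol3d = correlation_cost_volume(left, right, max_disp=6)   # shape (6, 4, 16)
vol4d = concat_cost_volume(left, right, max_disp=6)        # shape (16, 6, 4, 16)

# Hypothetical camera: 720 px focal length, 0.5 m baseline.
depth = disparity_to_depth([[64.0, 0.0]], focal_px=720.0, baseline_m=0.5)
```

The 4D volume keeps the feature channels, so it is larger by a factor of 2C than the 3D volume for the same disparity range, which is one reason the two families differ in memory and time cost.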
