Computer Engineering, 2020, Vol. 46, Issue (7): 228-234, 242. doi: 10.19678/j.issn.1000-3428.0055428

• Graphics and Image Processing •

Stereo Matching Network Based on Asymmetric Spatial Pyramid Pooling

WANG Jinhe, SU Cuili, MENG Fanyun, CHE Zhilong, TAN Hao, ZHANG Nan

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong 266000, China
  • Received: 2019-07-09 Revised: 2019-08-20 Published: 2019-08-26
  • About the authors: WANG Jinhe (born 1963), male, professor, whose main research interests are image processing and pattern recognition; SU Cuili, master's student; MENG Fanyun, lecturer; CHE Zhilong and TAN Hao, master's students; ZHANG Nan, lecturer.
  • Funding:
    National Natural Science Foundation of China (31271077); Science and Technology Program of Higher Education Institutions of Shandong Province (J17KA061).

Abstract: Convolutional Neural Networks (CNNs) are widely used in image processing algorithms because of their strong representational power, but they are time-consuming and lose information during processing. To address these problems, this paper proposes a CNN structure based on an Asymmetric Spatial Pyramid Pooling (ASPP) model. An asymmetric pyramid pooling method is designed and integrated into the stereo matching network to obtain more detailed image feature information. Convolutional layers with 3×3 and 1×1 kernels are stacked to fuse multi-scale information and speed up network convergence, and the network depth is increased from four layers to seven layers to improve matching accuracy. Disparity prediction is performed on the KITTI and Middlebury datasets. Experimental results show that, compared with the baseline network, the proposed structure shortens the convergence time by about 50.1% and reduces the matching error rate from 6.65% to 4.78%, producing smoother disparity maps in stereo matching.

Key words: Convolutional Neural Network (CNN), Asymmetric Spatial Pyramid Pooling (ASPP), multi-scale fusion, information loss, stereo matching
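
The abstract does not give the geometry of the asymmetric pyramid pooling layer, so the following is only a minimal sketch of the general idea of spatial pyramid pooling with non-square bin grids, not the authors' implementation. The function names `bin_edges` and `pyramid_pool`, the choice of max pooling, and the example grid sizes are all assumptions made for illustration. The sketch shows how feature maps of different spatial sizes can be pooled into descriptors of the same length.

```python
import numpy as np

def bin_edges(size, parts):
    """Split range(size) into `parts` contiguous, non-empty intervals.

    Uses floor for the start and ceil for the end of each interval,
    so neighbouring bins may overlap by one element.
    """
    starts = [(i * size) // parts for i in range(parts)]
    ends = [-((-(i + 1) * size) // parts) for i in range(parts)]
    return list(zip(starts, ends))

def pyramid_pool(features, grids):
    """Max-pool a (C, H, W) feature map over each (rows, cols) grid.

    Each grid level contributes rows * cols bins; every bin yields one
    value per channel. All pooled vectors are concatenated, so the output
    length is C * sum(rows * cols) regardless of H and W.
    """
    pooled = []
    _, height, width = features.shape
    for rows, cols in grids:
        for top, bottom in bin_edges(height, rows):
            for left, right in bin_edges(width, cols):
                pooled.append(features[:, top:bottom, left:right].max(axis=(1, 2)))
    return np.concatenate(pooled)

# Hypothetical pyramid levels with non-square grids (assumed, not from the paper).
grids = [(1, 1), (1, 2), (2, 4)]
a = pyramid_pool(np.random.rand(8, 9, 13), grids)
b = pyramid_pool(np.random.rand(8, 11, 20), grids)
print(a.shape, b.shape)  # both (88,): 8 channels * (1 + 2 + 8) bins
```

Because the output length depends only on the channel count and the grid list, inputs of different sizes map to descriptors that can be compared directly, which is the property pyramid pooling is typically used for.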
