
Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 223-231. doi: 10.19678/j.issn.1000-3428.0065628

• Graphics and Image Processing •

Stereo Matching Algorithm Based on Atrous Convolution and Attention Module

Zhihao LIU, Fanyun MENG*, Jinhe WANG, Nan ZHANG

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, Shandong, China
  • Received: 2022-08-30 Online: 2023-08-15 Published: 2023-08-16
  • Contact: Fanyun MENG
  • About the authors:

    Zhihao LIU (born 1997), male, master's student; main research interest: stereo matching

    Jinhe WANG, professor, Ph.D.

    Nan ZHANG, lecturer

  • Supported by:
    Natural Science Foundation of Shandong Province (ZR2019BA014)

Abstract:

Most stereo matching algorithms based on convolutional neural networks require a large receptive field. In most of them, however, enlarging the receptive field also sharply increases the number of parameters, so the algorithm demands a large amount of training data. This paper proposes a stereo matching algorithm based on atrous convolution and an attention module. An atrous convolution module combines a residual structure with atrous convolution to enlarge the network's receptive field while keeping the parameter count low. An attention module integrates multi-level information through convolutions at different levels, making the extracted information more complete. A spatial pyramid pooling module further enlarges the model's receptive field through weighted pyramid pooling and assigns different degrees of importance to information at different levels. Experimental results show that, with the same dataset and the same amount of training, the proposed algorithm converges faster than DispNetC and other algorithms. It also has a simple structure and few parameters, and is suitable for small-sample data.

Key words: stereo matching, small-sample data, atrous convolution, attention module, pyramid pooling
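
The abstract's central claim, that atrous convolution enlarges the receptive field without adding parameters, can be checked with simple arithmetic. The sketch below is only an illustration and not the paper's network: the three-layer stack, the dilation rates (1, 2, 4), and the channel width of 32 are assumptions chosen for the example, and the helper names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    layers: list of (kernel_size, dilation, stride) tuples, applied in order.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs in input pixels
    for kernel_size, dilation, stride in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


def conv_params(kernel_size, c_in, c_out, bias=True):
    """Weights (plus optional biases) of one 2-D conv layer.

    Dilation does not appear here: it spreads the kernel taps apart
    without adding any weights.
    """
    return kernel_size * kernel_size * c_in * c_out + (c_out if bias else 0)


channels = 32  # assumed channel width, for illustration only

# Three ordinary 3x3 convolutions versus three 3x3 atrous convolutions
# with assumed dilation rates 1, 2, 4 (all stride 1).
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
atrous = [(3, 1, 1), (3, 2, 1), (3, 4, 1)]

plain_rf = receptive_field(plain)
atrous_rf = receptive_field(atrous)
plain_params = sum(conv_params(k, channels, channels) for k, _, _ in plain)
atrous_params = sum(conv_params(k, channels, channels) for k, _, _ in atrous)

print(f"plain : receptive field {plain_rf}, parameters {plain_params}")
print(f"atrous: receptive field {atrous_rf}, parameters {atrous_params}")
```

Under these assumptions both stacks hold the same number of weights, while the atrous stack covers a 15-pixel-wide window instead of 7. A single undilated layer would need a 15x15 kernel to reach the same coverage.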