
Computer Engineering ›› 2023, Vol. 49 ›› Issue (4): 206-216. doi: 10.19678/j.issn.1000-3428.0064097

• Graphics and Image Processing •

Light Field Depth Estimation Method Combining Image Feature Transfer

LUO Shaocong1,2, ZHANG Xudong1,2, WAN Le1,2, XIE Linfang1,2, LI Shuyu1,2

  1. School of Computer and Information, Hefei University of Technology, Hefei 230601, China;
  2. Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei 230009, China
  • Received: 2022-03-04 Revised: 2022-05-12 Published: 2022-05-25
  • About the authors: LUO Shaocong (born 1996), male, master's student, whose main research interests are light field depth estimation and computer vision; ZHANG Xudong (corresponding author), professor, Ph.D.; WAN Le, XIE Linfang, and LI Shuyu, master's students.
  • Funding:
    National Natural Science Foundation of China (61876057, 61971177); Anhui Provincial Key Research and Development Program, Special Project on Strengthening Police through Science and Technology (202004d07020012).

Abstract: Light field cameras capture both the positional and the angular information of light rays in a single exposure, which gives them a unique advantage in depth estimation. Depth labels for real-scene light field datasets are difficult to obtain and of limited accuracy, so most existing light field depth estimation methods are trained on synthetic-scene datasets. However, synthetic and real datasets differ in their image feature distributions, so the mapping between sub-aperture images and depth maps that a network learns on synthetic data tends to be biased when it is applied to real data. This study proposes a new light field depth estimation method. First, an image translation network based on adversarial learning brings the feature distribution of synthetic-scene sub-aperture images closer to that of real-scene sub-aperture images. A multi-view angle consistency constraint is imposed in the image translation network so that the disparity relationship between sub-aperture images of different views is the same before and after translation. Second, a multi-channel Dense Connection (DC) depth estimation network is designed: a multi-channel input module extracts features from sub-aperture image stacks along different directions, and DC modules fuse these features, which improves the efficiency of feature extraction and feature propagation. Experiments on the synthetic light field dataset 4D Light Field Benchmark and the real light field dataset Stanford Lytro Light Field show that, compared with the Baseline network, the proposed network reduces the Mean Square Error (MSE) and the Bad Pixel (BP) rate by 23.3% and 8.6% on average, respectively. Compared with methods such as EPINET, EPI_ORM, and EPN+OS+GC, the proposed method improves depth estimation accuracy and shows good robustness and generalization ability.

Key words: light field, depth estimation, adversarial learning, feature transfer, angle consistency, Dense Connection(DC) module
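
The abstract reports improvements in two error measures, MSE and bad pixel rate, without giving their formulas. The following is a minimal sketch of how such measures are commonly computed on a predicted disparity map, assuming the usual definitions (mean of squared per-pixel errors, and the fraction of pixels whose absolute error exceeds a threshold). The function names, the flattened-list representation, and the threshold of 0.07 are illustrative assumptions, not details taken from the paper.

```python
def mean_square_error(pred, truth):
    """Mean of squared per-pixel disparity errors (flattened maps)."""
    squared = [(p - t) ** 2 for p, t in zip(pred, truth)]
    return sum(squared) / len(squared)


def bad_pixel_rate(pred, truth, threshold=0.07):
    """Fraction of pixels whose absolute error exceeds `threshold`.

    threshold=0.07 is an assumed value chosen for illustration;
    the abstract does not state which threshold the authors use.
    """
    bad = sum(1 for p, t in zip(pred, truth) if abs(p - t) > threshold)
    return bad / len(pred)


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers an error relative to `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Toy disparity maps, flattened to lists (hypothetical values).
truth = [0.0, 0.5, 1.0, -0.5]
pred = [0.0, 0.6, 1.0, -0.4]

mse = mean_square_error(pred, truth)
bp = bad_pixel_rate(pred, truth)
```

A statement such as "MSE is reduced by 23.3% on average" would then correspond to averaging `relative_reduction(baseline_mse, proposed_mse)` over the test scenes, again under the assumption that the reported figures are relative reductions.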
