Abstract: Non-line-of-sight (NLOS) imaging is a technique that combines imaging and computational reconstruction: instead of photographing a scene directly, it reconstructs the hidden scene from the scattered or reflected information the scene casts onto an intermediate medium. Passive NLOS imaging is still at an early stage of development; there is no systematic methodology for scene modeling or target-information reconstruction, and unoccluded, non-self-luminous scenes in particular have received little study. To address this, an NLOS imaging solution for such scenes is proposed. The solution consists of two steps. First, based on light-radiation theory, the relationship between the image formed on the diffuse reflection surface and the shape of the hidden object is analyzed, establishing the NLOS imaging model and the reconstruction target. Second, in the reconstruction stage, it is pointed out that existing deep-learning reconstruction methods do not follow the physical model when building their datasets, which leaves them unable to reconstruct scenes outside the training set; to tackle this, rendering software is combined with the MPEG7 dataset to generate a physically consistent passive NLOS diffuse-shadow dataset (Diffuse-Shadow-NLOS, DS-NLOS). A passive NLOS reconstruction network framework, Re-NLOS (Restore Non-Line-Of-Sight network), is then proposed. It combines the Vision Transformer (ViT) structure with a conditional generative adversarial network (cGAN) to extract global features from the captured images of the diffuse reflection surface and thereby recover the shape of the hidden object. Reconstruction results on simulated images show that the method can recover the shape information of hidden objects from diffuse reflection surfaces: on a test set of 20 object categories, the average peak signal-to-noise ratio (PSNR) improves by 5.85 dB and the average structural similarity (SSIM) by 0.04, and the model also shows some ability to recover real indoor scenes.