NeRF 3D Reconstruction Method Based on Cone Tracking and Network Decomposition

doi:10.19678/j.issn.1000-3428.0068291

Abstract

In computer vision, a Neural Radiance Field (NeRF) takes spatial coordinates, optionally together with other dimensions such as time and camera pose, as input and uses a Multi-Layer Perceptron (MLP) network to approximate a target function that outputs scalar quantities such as color and depth. NeRF reconstructs 3D scenes with high quality, but it produces overly blurred renderings or artifacts when scenes are handled at different resolutions, and it is slow to train. To address these two problems, this study proposes a NeRF 3D reconstruction method based on cone tracking and network decomposition. First, cone tracking casts a cone through each pixel; the cone is cut into a series of conical frustums, which are featurized along the cone, and blur and artifacts are reduced by efficiently rendering these anti-aliased frustums. Second, network decomposition splits the original NeRF network, which receives five-dimensional input, into two networks, which shortens the training time. Experimental results on the NeRF_Synthetic, LLFF, and Multiresolution datasets show that the proposed method improves the Peak Signal-to-Noise Ratio (PSNR) by 14.4%-24.6% compared with NeRF, F²-NeRF, and other methods. It also reconstructs richer detail with better visual quality and substantially reduces the training time.
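The cone-cutting step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the cut points are chosen, so uniform spacing along the cone axis is assumed here, and the names `cut_cone_into_frustums` and `pixel_radius` (the cone radius at unit distance from the camera) are hypothetical. The featurization of each frustum and the rendering step are not shown.

```python
import math


def cut_cone_into_frustums(t_near, t_far, n_frustums, pixel_radius):
    """Cut the cone cast through one pixel into consecutive conical frustums.

    The cone's apex is at the camera origin and its axis is the pixel's ray.
    Assumption for this sketch: the cone radius grows linearly with the
    distance t along the axis, radius(t) = pixel_radius * t, and the cut
    points are spaced uniformly between t_near and t_far.
    """
    if n_frustums < 1 or not 0.0 <= t_near < t_far:
        raise ValueError("need n_frustums >= 1 and 0 <= t_near < t_far")

    step = (t_far - t_near) / n_frustums
    frustums = []
    for i in range(n_frustums):
        t0 = t_near + i * step          # near cap of this frustum
        t1 = t0 + step                  # far cap of this frustum
        r0 = pixel_radius * t0          # radius at the near cap
        r1 = pixel_radius * t1          # radius at the far cap
        # Volume of a conical frustum with end radii r0, r1 and height t1 - t0.
        volume = math.pi * (t1 - t0) * (r0 * r0 + r0 * r1 + r1 * r1) / 3.0
        frustums.append({"t0": t0, "t1": t1, "r0": r0, "r1": r1, "volume": volume})
    return frustums


if __name__ == "__main__":
    for f in cut_cone_into_frustums(2.0, 6.0, 4, 0.01):
        print(f"t in [{f['t0']:.2f}, {f['t1']:.2f}]  "
              f"radius {f['r0']:.4f} -> {f['r1']:.4f}  volume {f['volume']:.6f}")
```

Because each frustum covers a volume rather than a single point on a ray, distant frustums are wider than near ones, which is what lets a renderer account for the footprint of a pixel at different resolutions.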

Key words: Neural Radiance Fields (NeRF), Multi-Layer Perceptron (MLP), 3D reconstruction, neural network, implicit reconstruction, cone tracking, network decomposition

JING Weipeng, WANG Yuanfeng, LI Chao. NeRF 3D Reconstruction Method Based on Cone Tracking and Network Decomposition[J]. Computer Engineering, 2024, 50(10): 334-341.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0068291

https://www.ecice06.com/EN/Y2024/V50/I10/334

Figures/Tables 10

Fig.1 Network structure of NeRF 3D reconstruction method based on cone tracking and network decomposition

Fig.2 Schematic diagram of MipMapping

Fig.3 Schematic diagram of cone tracking

Fig.4 Schematic diagram of network structure decomposition

Fig.5 Visual comparison results of images

Fig.6 Visual comparison results of images in ablation experiment
