Acceleration Approach for Neural Radiance Field in Dynamic 3D Human Reconstruction

doi:10.19678/j.issn.1000-3428.0069317

Abstract

This study proposes an acceleration method for Neural Radiance Fields (NeRF) in dynamic 3D human reconstruction, addressing low training efficiency and the high computational complexity of volume rendering. Multiresolution hash encoding is used as the positional encoding, which improves the NeRF's ability to represent fine local detail and speeds up convergence. A shallow network is designed to estimate the volume density, and an opacity loss function is proposed to optimize this network against the human alpha maps output by PP-Matting. During volume rendering, the density estimation network is used to estimate the transmittance distribution along each camera ray; importance sampling is then implemented by inverse transform sampling of this distribution, which reduces the number of unnecessary sampling points and improves the computational efficiency of volume rendering. In addition, precise human foreground masks are obtained by binarizing the alpha maps, which improves the quality of the human reconstruction datasets. Experiments show that, with multiresolution hash encoding and the importance sampling strategy, the method improves reconstruction speed on the ZJU-MoCap and SHTU-MoCap datasets by 17.7%, 9.5%, and 37.5% compared with Neural Body, HumanNeRF, and MonoHuman, respectively, while also achieving higher reconstruction accuracy. PP-Matting combined with binarization raises human mask extraction accuracy to over 96%.

Key words: 3D human reconstruction, Neural Radiance Field (NeRF), volume rendering acceleration, human mask extraction, positional encoding
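
The importance sampling step summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `sample_along_ray`, the bin layout, and the hand-made step density standing in for the shallow density estimation network are all assumptions made for illustration. The sketch accumulates optical depth along one ray, converts it to transmittance T, treats 1 - T as a cumulative distribution, and draws sample depths by inverse transform sampling.

```python
import numpy as np

def sample_along_ray(t_edges, sigma, n_samples, rng):
    """Draw sample depths on one ray by inverse transform sampling of 1 - T(t).

    t_edges: (N+1,) increasing bin edges along the ray.
    sigma:   (N,) non-negative density per bin (here a stand-in for a
             density estimation network; hypothetical).
    """
    delta = np.diff(t_edges)                          # bin widths
    # Optical depth accumulated up to every bin edge (0 at the first edge).
    tau = np.concatenate([[0.0], np.cumsum(sigma * delta)])
    cdf = 1.0 - np.exp(-tau)                          # 1 - transmittance at each edge
    total = cdf[-1]
    if total < 1e-8:
        # The ray passes through empty space: fall back to uniform samples.
        return np.sort(rng.uniform(t_edges[0], t_edges[-1], n_samples))
    cdf = cdf / total                                 # normalize so cdf[-1] == 1

    u = rng.uniform(0.0, 1.0, n_samples)
    # Locate the bin whose CDF interval contains each u.
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(delta) - 1)
    lo, hi = cdf[idx], cdf[idx + 1]
    frac = np.clip((u - lo) / np.maximum(hi - lo, 1e-12), 0.0, 1.0)
    # Linear interpolation inside the selected bin.
    return np.sort(t_edges[idx] + frac * delta[idx])

# Toy ray: density is non-zero only between depths 2.0 and 2.5 (assumed values).
t_edges = np.linspace(0.0, 4.0, 65)
centers = 0.5 * (t_edges[:-1] + t_edges[1:])
sigma = np.where((centers >= 2.0) & (centers < 2.5), 50.0, 0.0)

samples = sample_along_ray(t_edges, sigma, 32, np.random.default_rng(0))
print(samples.min(), samples.max())
```

Because bins with zero density contribute nothing to the cumulative distribution, no samples are spent on empty space in front of or behind the toy surface, which is the intuition behind the reduction in unnecessary sampling points.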

XIAO Yilong, DENG Yiqin, CHEN Zhigang. Acceleration Approach for Neural Radiance Field in Dynamic 3D Human Reconstruction[J]. Computer Engineering, 2025, 51(8): 95-106.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0069317

https://www.ecice06.com/EN/Y2025/V51/I8/95

Figures/Tables 11

Fig.1 Process diagram of the proposed 3D human reconstruction method

Fig.2 Illustration of 2D multiresolution hash encoding

Fig.3 Illustration of importance sampling point generation

Fig.4 Output results of PP-HumanSeg and PP-Matting

Fig.5 Effects and errors of different human mask extraction methods

Fig.6 Effects of different dynamic human reconstruction methods

Fig.7 Human body reconstruction effects under different volume rendering sampling strategies
