
Computer Engineering

   

A High-Fidelity 3D Reconstruction Method for Cultural Relics Based on Segmentation Prior and 3DGS

Qinghao LIANG1, Hongjuan GAO1,2*

  

  • Published: 2026-06-22


Abstract: 3D reconstruction of cultural relics provides important technical support for the digital preservation, virtual exhibition, and digital restoration of cultural heritage. Compared with structured-light and laser scanning, which rely on specialized equipment and controlled acquisition environments, multi-view image-based 3D reconstruction offers low acquisition cost, flexible operation, and low deployment requirements, making it better suited to digitizing cultural relics in museum exhibition spaces. However, images captured in real museum settings are often affected by complex backgrounds, glass reflections, uneven illumination, local occlusions, and limited shooting viewpoints, so the target relic is heavily intermixed with display platforms, walls, and other background regions in image space. Although the original 3D Gaussian Splatting (3DGS) method achieves efficient training and real-time rendering through explicit Gaussian primitives, it is designed mainly for complete scene modeling and lacks a semantic focusing mechanism for the relic itself. Consequently, redundant background points and non-target Gaussians tend to participate in optimization, increasing GPU memory consumption, training time, and model size. In addition, abnormal elongation and artifacts may occur around object boundaries, degrading the stable representation of the relic's geometry and texture details.
To improve the accuracy and efficiency of relic reconstruction in complex museum environments, a high-fidelity 3D reconstruction method based on segmentation priors and 3DGS is proposed, in which two-dimensional subject segmentation results are introduced into the 3D Gaussian modeling process. The Segment Anything Model (SAM) is used to generate subject masks of cultural relics from multi-view images. Using the camera poses and sparse point cloud estimated by structure from motion (SfM), 3D points are projected onto the mask plane of each view, and points that consistently fall into background regions are removed according to multi-view semantic consistency, yielding a cleaner and more compact subject point cloud from the initialization stage. During Gaussian optimization, a mask-guided constraint restricts the color reconstruction loss to the relic region, so that parameter updates focus on subject geometry, surface texture, and local details while interference from background regions is reduced. To address the abnormal elongation of Gaussian ellipsoids caused by insufficient sampling and depth discontinuities near relic contours, an edge pruning strategy based on geometric morphological constraints is designed. Morphologically abnormal Gaussian primitives near object boundaries are identified by their major-to-minor axis ratio and removed, suppressing “black spike” artifacts and edge noise diffusion while improving the continuity, compactness, and visual stability of subject boundaries.
Experimental results on public datasets, including Tanks&Temples, Mip-NeRF 360, LERF, and LLFF, as well as a self-built cultural relic dataset, show that the proposed method achieves favorable overall performance in reconstruction accuracy, structural consistency, and perceptual quality. On the public datasets, the average PSNR, SSIM, and LPIPS are 32.99 dB, 0.977, and 0.026, respectively; on the self-built cultural relic dataset, they are 35.48 dB, 0.983, and 0.027, respectively. Compared with the original 3DGS and related methods, including LightGaussian, 3DGSR, 2DGS, Perceptual-GS, and FCGS, the proposed method produces clearer subject contours and more stable texture representations under complex background conditions. Resource consumption comparisons and ablation experiments show that segmentation prior-guided point cloud filtering and edge pruning jointly reduce redundant background Gaussians and alleviate contour artifacts, significantly lowering training cost without compromising reconstruction quality. Compared with the original 3DGS, training time is reduced by approximately 60%, GPU memory consumption by approximately 40%, and model size by approximately 50%, providing a feasible solution for low-cost, efficient, and high-fidelity 3D reconstruction of museum cultural relics under uncontrolled acquisition conditions.
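
The three steps named in the abstract (multi-view mask-consistency point filtering, a mask-restricted color loss, and axis-ratio pruning) might be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation. All function names, the `min_fg_ratio` voting threshold, and the `max_axis_ratio` value are hypothetical. The loss is a plain L1 term used for illustration, since the abstract does not specify the exact loss. Masks are assumed to be boolean arrays already produced by a segmentation model, and cameras are assumed to be pinhole `(K, R, t)` triples mapping world to camera coordinates.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project Nx3 world points with a pinhole camera (assumed model).
    Returns pixel coordinates (N, 2) and camera-space depth (N,)."""
    cam = points @ R.T + t                      # world -> camera frame
    depth = cam[:, 2]
    safe = np.where(depth > 1e-8, depth, 1.0)   # avoid dividing by ~0 or negative depth
    uv = (cam @ K.T)[:, :2] / safe[:, None]
    return uv, depth

def filter_points_by_masks(points, cameras, masks, min_fg_ratio=0.5):
    """Return a boolean keep-mask over points.
    A point is kept if it lands on the subject mask in at least
    min_fg_ratio of the views where it projects inside the image
    (hypothetical voting rule; points seen in no view are dropped)."""
    fg_votes = np.zeros(len(points))
    seen = np.zeros(len(points))
    for (K, R, t), mask in zip(cameras, masks):
        uv, depth = project_points(points, K, R, t)
        h, w = mask.shape
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        visible = (depth > 1e-8) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        seen += visible
        idx = np.flatnonzero(visible)
        fg_votes[idx] += mask[v[idx], u[idx]]   # one vote per foreground hit
    ratio = np.divide(fg_votes, seen, out=np.zeros_like(fg_votes), where=seen > 0)
    return ratio >= min_fg_ratio

def masked_l1_loss(rendered, target, mask):
    """Mean absolute color error over subject pixels only (L1 stand-in)."""
    m = mask.astype(np.float64)[..., None]      # HxWx1, broadcast over channels
    denom = max(m.sum() * rendered.shape[-1], 1.0)
    return float((np.abs(rendered - target) * m).sum() / denom)

def elongated_gaussians(scales, max_axis_ratio=10.0, near_boundary=None):
    """Flag Gaussians whose longest/shortest axis ratio exceeds the threshold.
    scales: Nx3 positive per-axis scales. near_boundary: optional boolean (N,)
    restricting pruning to Gaussians near the subject contour (assumed input)."""
    ratio = scales.max(axis=1) / np.maximum(scales.min(axis=1), 1e-12)
    flagged = ratio > max_axis_ratio
    if near_boundary is not None:
        flagged &= near_boundary
    return flagged
```

In a real pipeline the keep-mask would be applied to the SfM point cloud before Gaussian initialization, the masked loss would replace the full-image color term during training, and flagged Gaussians would be removed at pruning intervals; how the boundary region is determined is left open here because the abstract does not specify it.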
