
Computer Engineering ›› 2026, Vol. 52 ›› Issue (5): 239-249. doi: 10.19678/j.issn.1000-3428.0070753

• Computer Vision and Graphics & Image Processing •

Dual-View Point Cloud Reconstruction Method Based on Rotation-Invariant Regional Consistency

GAO Yufei, JIA Xin*, HUANG Zhangchi, XU Zhinan, HUO Pengfei, LU Zhiyin

  1. Engineering Training Center, Tianjin University of Technology, Tianjin 300384, China
  • Received: 2024-12-26  Revised: 2025-03-09  Online: 2026-05-15  Published: 2025-04-21
  • Corresponding author: JIA Xin
  • About the authors:

    GAO Yufei, B.S., whose main research interests include computer vision, 3D reconstruction, and point cloud processing

    JIA Xin (CCF member, corresponding author), lecturer, Ph.D., master's supervisor

    HUANG Zhangchi, B.S.

    XU Zhinan, B.S.

    HUO Pengfei, B.S.

    LU Zhiyin, B.S.

  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (62302335); Tianjin Undergraduate Innovation and Entrepreneurship Training Program (202310060026)



Abstract:

Multi-view 3D reconstruction aims to recover the 3D shape of a given object from multiple 2D images. However, existing methods neglect to learn the rotation invariance and regional consistency of objects, making it difficult to accurately aggregate multi-view features and causing a loss of detail in the reconstruction. To address this issue, this study proposes a Dual-view Point cloud reconstruction method based on Rotation-invariant Regional consistency (DPR2). DPR2 takes two RGB images as input, explores the rotation invariance of object regions, learns the regional consistency of objects across views, promotes the aggregation of multi-view features, and reconstructs a fine point cloud of the given object. In the encoding stage, a point-cloud initialization network is first introduced to initialize a coarse point cloud for each view. Then, a region-level rotation-invariant feature extraction network is proposed that captures the rotation-invariant features of different regions of the coarse point cloud by computing the Euclidean distance between pairs of points. In the decoding stage, a two-stage cross-attention mechanism is designed to construct high-quality regional consistency across cross-view point clouds, thereby accurately achieving multi-view feature aggregation. Additionally, a point-cloud refinement network is designed that uses the aggregated features to refine the coarse point cloud into one with fine-grained details and smooth surfaces. Extensive experiments on the ShapeNet and Pix3D datasets show that DPR2 outperforms existing state-of-the-art methods. Compared with the latest methods, P2M++ and MVP2M++, DPR2 improves the Chamfer Distance (CD) by 23.62% and 9.06%, respectively.
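The geometric cue the abstract relies on is that pairwise Euclidean distances within a point cloud region are unchanged by any rigid rotation, which is why distance-based region features are rotation-invariant. The following NumPy sketch is purely illustrative of that property (it is not the paper's network; `pairwise_distances` and `random_rotation` are hypothetical helper names introduced here):

```python
import numpy as np

def pairwise_distances(points):
    # points: (N, 3) array; returns the (N, N) matrix of Euclidean distances
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    # QR decomposition of a random Gaussian matrix gives a random orthogonal
    # matrix; flip one column if needed to ensure a proper rotation (det = +1)
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

rng = np.random.default_rng(0)
region = rng.normal(size=(64, 3))        # a "region" of a coarse point cloud
R = random_rotation(rng)
d0 = pairwise_distances(region)
d1 = pairwise_distances(region @ R.T)    # the same region, arbitrarily rotated
print(np.allclose(d0, d1))               # True: distances survive the rotation
```

Because the distance matrix is identical before and after rotation, any feature computed from it inherits rotation invariance, which is the premise behind the region-level feature extraction network described above.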

Key words: multi-view 3D reconstruction, point cloud, regional consistency, rotation invariance, feature aggregation