
Computer Engineering


A Method for Estimating the Pose of Weakly Textured Workpieces Based on RGB Images


  • Published: 2025-12-19


Abstract: To address the low pose-estimation accuracy caused by the lack of surface texture on workpieces in industrial application scenarios, a pose estimation method for weakly textured workpieces based on RGB images is proposed. First, an improved ResNeXt feature extraction network is used to extract workpiece features: dense connections between convolutional blocks reduce the loss of feature information during propagation, grouped convolutional residual blocks strengthen the model's perception of multi-channel spatial features, and an attention module inserted before the residual connection learns per-channel weights and localizes key regions. The pose estimation problem is then reformulated, and a cascaded convolutional pose estimation network predicts the pixel positions of key points together with a directional vector field. Finally, the workpiece pose is solved with a perspective projection transformation algorithm. To verify the method's effectiveness, a synthetic dataset of 20,000 images over 20 background types was constructed, covering different occlusion levels, illumination conditions, and observation distances. Ablation experiments show that the proposed method raises the ADD pass rate by 27.2% to 88.5%, with 70.1M parameters and an inference speed of 1.47 frames/s. On the YCB-Video dataset, the method achieves 89.2%, 95.6%, and 94.2% on the ADD(-S), AUC of ADD-S, and AUC of ADD(-S) metrics, respectively. On the Linemod Occlusion dataset, its average ADD(-S) is 88.7%, significantly higher than mainstream models such as DOPE and RePose. Experimental results demonstrate that the proposed method offers superior pose estimation accuracy and generalization in complex conditions such as weak texture, occlusion, and illumination changes.
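The abstract states that the cascaded network outputs, for each key point, a directional vector field: per-pixel unit vectors pointing toward the key point's image location. One simple way to recover the pixel position from such a field (the paper may instead use RANSAC-style voting; this is an illustrative sketch, with all names and data invented here) is a least-squares intersection of the lines defined by each pixel and its predicted direction:

```python
import numpy as np

def locate_keypoint(pixels, directions):
    """Least-squares intersection of the lines through each pixel along its
    predicted direction vector.

    pixels:     (N, 2) array of pixel coordinates.
    directions: (N, 2) array of vectors pointing toward the keypoint.
    Returns the 2D point minimizing the summed squared perpendicular
    distance to all N lines.
    """
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    eye = np.eye(2)
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, di in zip(pixels, d):
        # Projector onto the line's normal space: a point k lies on the
        # line through p with direction di iff (I - di di^T)(k - p) = 0.
        proj = eye - np.outer(di, di)
        A += proj
        b += proj @ p
    return np.linalg.solve(A, b)

# Synthetic check: directions at random pixels all point at (120, 85).
rng = np.random.default_rng(0)
kp_true = np.array([120.0, 85.0])
pix = rng.uniform(0, 200, size=(50, 2))
dirs = kp_true - pix
kp_est = locate_keypoint(pix, dirs)
```

With a noise-free field the normal equations recover the key point exactly; with a real network's noisy predictions, robust voting over pixel pairs would typically replace the plain least-squares solve.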

摘要: 为解决工业应用场景中工件表面缺乏纹理信息导致位姿估计精度低的问题,提出了一种基于RGB图像的弱纹理工件位姿估计方法。首先利用基于改进的ResNeXt特征提取网络获取工件的特征信息,通过在卷积块之间使用密集连接减少传递过程中特征信息的损失,引入分组卷积残差块,增强模型对多通道空间特征的感知能力,并在残差连接前加入注意力模块,学习各通道权重以及定位关键区域;然后对位姿估计问题进行转化,通过级联式卷积位姿估计网络获取关键点的像素位置和方向向量场;最后利用透视投影变换算法对工件位姿进行求解。为验证方法有效性,构建包含20种背景、20000张图像的合成数据集,覆盖不同遮挡程度、光照条件与观测距离场景。消融实验表明,所提方法ADD通过率提升27.2%,达到88.5%,参数量为70.1M,推理速度为1.47 F/S。在YCB-Video数据集上,所提方法在ADD(-S)、AUC of ADD-S和AUC of ADD(-S)三项指标分别达到89.2%、95.6%和94.2%;在Linemod Occlusion数据集上平均ADD(-S)指标为88.7%,较DOPE、RePose等主流模型显著提升。实验结果证明所提方法在弱纹理、遮挡及光照变化等复杂环境下具有优越的位姿估计精度与泛化能力。
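The final stage described above solves the workpiece pose from the 2D key-point positions and their known 3D model coordinates via perspective projection, i.e., a PnP problem. The paper does not specify the solver; as a hedged sketch (assuming known camera intrinsics K and at least six non-coplanar correspondences, all values below invented), a minimal Direct Linear Transform solution looks like this:

```python
import numpy as np

def pnp_dlt(points_3d, points_2d, K):
    """Recover the camera pose [R|t] from n >= 6 2D-3D correspondences
    with the Direct Linear Transform, then project the 3x3 block back
    onto the rotation group."""
    # Work in normalized camera coordinates: x = K^-1 [u, v, 1]^T
    uv1 = np.column_stack([points_2d, np.ones(len(points_2d))])
    xn = (np.linalg.inv(K) @ uv1.T).T
    A = []
    for (X, Y, Z), (u, v, _) in zip(points_3d, xn):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)            # solution up to scale and sign
    # Fix scale (rows of a rotation are unit length) and sign
    # (points must have positive depth in front of the camera).
    P /= np.linalg.norm(P[2, :3])
    if (points_3d @ P[2, :3] + P[2, 3]).mean() < 0:
        P = -P
    # Nearest proper rotation to the estimated 3x3 block.
    U, _, Vt3 = np.linalg.svd(P[:, :3])
    R = U @ np.diag([1, 1, np.sign(np.linalg.det(U @ Vt3))]) @ Vt3
    return R, P[:, 3]

# Synthetic check against a known pose.
def rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

K = np.array([[600.0, 0, 320], [0, 600, 240], [0, 0, 1]])
R_gt = rz(0.4) @ rx(0.2)
t_gt = np.array([0.1, -0.05, 2.5])
rng = np.random.default_rng(1)
pts3 = rng.uniform(-0.5, 0.5, size=(8, 3))
cam = pts3 @ R_gt.T + t_gt              # transform into camera frame
proj = cam @ K.T
pts2 = proj[:, :2] / proj[:, 2:3]       # perspective projection
R_est, t_est = pnp_dlt(pts3, pts2, K)
```

In practice a production system would more likely use an iterative or EPnP-style solver (e.g., OpenCV's `cv2.solvePnP`) with RANSAC for robustness to key-point outliers; the DLT above only illustrates the geometry of the perspective projection step.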