
计算机工程


轻量级高分辨率网络人体姿态估计算法研究

  • 发布日期:2024-04-26

The Research of Lightweight High-Resolution Network for Human Pose Estimation Algorithm

  • Published:2024-04-26

摘要: 人体姿态估计被广泛应用于运动健身、手势控制、无人超市、娱乐游戏等诸多领域,但姿态估计任务仍面临着诸多挑战。针对目前主流的人体姿态估计网络参数量大,计算复杂的问题,提出一种基于高分辨率网络的轻量级姿态估计网络(Lite-Pose Network,简称LitePose)。首先,采用Ghost卷积,降低特征提取网络的参数;其次,通过采用解耦的全连接(Decoupling Fully Connected Attention, DFC)注意力模块,可以更好地捕获远距离空间位置像素间的依赖关系,减少由于参数量下降导致提取特征的缺失,提高人体关键点回归的准确性;此外,设计了一个特征增强模块对骨干网络提取的特征进行进一步增强。最后,设计了一个新的坐标解码方法,降低热图解码过程的误差,提高关键点回归的准确率。将LitePose在人体关键点检测数据集COCO和MPII上进行了验证,与当前的主流方法进行对比,实验结果表明LitePose相比基线网络HRNet精度损失0.2%,但参数量不及基线网络的三分之一。在保证少量损失网络精度的同时,大幅降低了网络模型的参数量。

Abstract: Human pose estimation is widely used in fields such as sports and fitness, gesture control, unmanned supermarkets, and entertainment and gaming, but the task still faces many challenges. To address the large parameter counts and high computational complexity of current mainstream human pose estimation networks, we propose a lightweight pose estimation network based on the high-resolution network (Lite-Pose Network, abbreviated LitePose). First, Ghost convolution is used to reduce the parameters of the feature extraction network. Second, a decoupled fully connected (DFC) attention module is adopted to better capture dependencies between pixels at distant spatial positions, which reduces the loss of extracted features caused by the lower parameter count and improves the accuracy of human keypoint regression. In addition, a feature enhancement module is designed to further enhance the features extracted by the backbone network. Finally, a new coordinate decoding method is designed to reduce the error of the heatmap decoding process and improve the accuracy of keypoint regression. LitePose was validated on the human keypoint detection datasets COCO and MPII and compared with current mainstream methods. The experimental results show that LitePose loses 0.2% accuracy relative to the baseline network HRNet while using less than one-third of the baseline's parameters. The number of model parameters is thus greatly reduced at the cost of only a small loss in accuracy.
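
As a rough illustration of why Ghost convolution lowers the parameter count, the sketch below compares the weights of a standard convolution with those of a Ghost module in its commonly described form: a primary convolution produces a fraction of the output channels, and a cheap depthwise convolution generates the rest. The channel sizes, the ratio of 2, and the 3 x 3 depthwise kernel are illustrative assumptions and are not taken from the paper. The function names are hypothetical, and the abstract does not specify how LitePose configures its Ghost layers.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int,
                      ratio: int = 2, dw_k: int = 3) -> int:
    """Weights in a Ghost module: primary conv plus cheap depthwise conv.

    ratio and dw_k are assumed defaults for illustration only.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    # Channels produced by the ordinary (primary) convolution.
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    # Each intrinsic channel yields (ratio - 1) extra "ghost" channels,
    # each through its own depthwise dw_k x dw_k filter.
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(64, 128, 3)
ghost = ghost_conv_params(64, 128, 3)
print(standard, ghost, round(standard / ghost, 2))
```

With these assumed sizes the Ghost module needs roughly half the weights of the standard layer. The overall reduction in a full network depends on which layers are replaced and on the ratio chosen.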