Computer Engineering ›› 2023, Vol. 49 ›› Issue (4): 226-232, 239. doi: 10.19678/j.issn.1000-3428.0065262

• Graphics and Image Processing •

Research on Lightweight Human Pose Estimation Based on High-Resolution Network

ZHONG Baorong, WU Xialing

  1. College of Computer Science and Technology, Yangtze University, Jingzhou 434000, Hubei, China
  • Received: 2022-07-18 Revised: 2022-09-05 Published: 2022-09-21
  • About the authors: ZHONG Baorong (born 1963), male, professor; main research interests: graphics and image processing, machine learning. WU Xialing (corresponding author), master's student.
  • Funding:
    National Natural Science Foundation of China (62006028).

Abstract: Existing human pose estimation networks typically improve prediction accuracy by increasing the depth of the network model, but this also increases the number of parameters and the computational complexity of the model. To address this, a lightweight human pose estimation network, GSENet, is proposed. It builds on the high-resolution network and integrates the Ghost module, the Sandglass module, and an attention mechanism. Following the basic residual modules Bottleneck and Basicblock, the standard convolution in Bottleneck is replaced by Ghost convolution, and the convolution in Basicblock is replaced by the Sandglass module. The basic modules are thereby rebuilt as GSEneck and GSEblock, which reduces the number of parameters and the computational complexity. An attention mechanism is then added to preserve the prediction accuracy of the network. Experimental results show that, compared with HRNet, GSENet reduces the number of parameters and the computational complexity by 84.6% and 76.1%, respectively, on the COCO dataset, and by 84.6% and 76.8%, respectively, on the MPII dataset. GSENet can therefore effectively reduce the number of network parameters and the computational complexity while maintaining a certain level of prediction accuracy.

Keywords: human pose estimation, high-resolution network, lightweight network, attention mechanism, deep convolutional neural network (CNN)
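
The parameter savings described in the abstract can be illustrated with a small counting sketch. It is not the authors' implementation: the abstract gives no layer shapes or Ghost hyperparameters, so the channel counts, kernel sizes, and the `ratio` and `dw_kernel` values below are hypothetical, and the function names are made up for illustration. The counting assumes the usual Ghost module layout, in which a primary convolution produces a fraction of the output channels and a cheap depthwise convolution generates the rest. The helper `shrink_factor` only converts a reported percentage reduction into a "times smaller" factor.

```python
import math


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int,
                      ratio: int = 2, dw_kernel: int = 3) -> int:
    """Weight count of a Ghost-style convolution (bias ignored).

    Assumed layout (hypothetical hyperparameters):
      * a primary k x k convolution produces ceil(c_out / ratio) channels;
      * a depthwise dw_kernel x dw_kernel convolution derives the
        remaining (ratio - 1) "ghost" maps per primary channel.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * dw_kernel * dw_kernel
    return primary + cheap


def shrink_factor(reduction_percent: float) -> float:
    """Convert 'reduced by p%' into 'original is this many times larger'."""
    return 1.0 / (1.0 - reduction_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical 64 -> 64 channel, 3 x 3 layer; not taken from the paper.
    std = standard_conv_params(64, 64, 3)
    ghost = ghost_conv_params(64, 64, 3, ratio=2, dw_kernel=3)
    print(f"standard: {std}, ghost: {ghost}, ratio: {std / ghost:.2f}")

    # Reductions reported in the abstract for COCO.
    print(f"parameters: {shrink_factor(84.6):.2f}x smaller")
    print(f"computation: {shrink_factor(76.1):.2f}x smaller")
```

For the hypothetical 64-to-64-channel layer, the standard convolution has 36,864 weights and the Ghost version 18,720, roughly half. The reported 84.6% parameter reduction corresponds to a model about 6.5 times smaller than HRNet, which a Ghost substitution with `ratio=2` alone would not reach; the remaining savings would have to come from the other changes the abstract mentions, such as the Sandglass module.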

CLC number: