
Computer Engineering, 2022, Vol. 48, Issue (5): 306-313. doi: 10.19678/j.issn.1000-3428.0061304

• Development Research and Engineering Application •

Human Skeleton Action Recognition Method Based on Lightweight Graph Convolution

SUN Qixiang1, HE Ning2, ZHANG Congcong1, LIU Shengjie1

  1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China;
  2. Smart City College, Beijing Union University, Beijing 100101, China
  • Received: 2021-03-29 Revised: 2021-05-16 Published: 2021-05-26
  • About the authors: SUN Qixiang (born 1994), male, master's student; main research interests: digital image processing and computer vision. HE Ning (corresponding author), professor, Ph.D. ZHANG Congcong and LIU Shengjie, master's students.
  • Funding:
    National Natural Science Foundation of China (61872042, 61572077); Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ201911417048); General Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KM202111417009); Academic Human Resources Development Program of Beijing Union University (BPHR2020AZ01, BPHR2020EZ01); Scientific Research Project of Beijing Union University (ZK50202001); Graduate Research and Innovation Project of Beijing Union University (YZ2020K001).

Abstract: Human action recognition in video has attracted extensive attention in computer vision. Skeleton-based methods represent human motion explicitly and have therefore become an important research direction in the field. To address the large parameter counts and high computational complexity of most mainstream action recognition methods, a lightweight graph convolutional network that fuses multistream data is designed and applied to human skeleton action recognition. In the data preprocessing stage, a multistream data fusion method fuses four feature data streams so that the best result is obtained with a single round of training, which reduces the number of network parameters. A non-local network module based on the graph convolutional network is designed to capture global information of the image and thereby improve recognition accuracy. On this basis, a spatial Ghost graph convolution module and a temporal Ghost graph convolution module are designed to further reduce the number of parameters at the level of network structure. Experiments on the action recognition datasets NTU60 RGB+D and NTU120 RGB+D show that, compared with recent mainstream methods such as ST-GCN, 2s AS-GCN, and 2s AGCN, the proposed method achieves high recognition accuracy while keeping the number of network parameters low.

Keywords: human skeleton action recognition, data fusion, graph convolution, non-local network module, Ghost network

CLC number: