Computer Engineering

   

Skeleton Action Recognition Based on Multi-Granularity Cross Attention

  

  • Published: 2025-07-04

Abstract: Skeleton-based action recognition methods are attracting increasing attention because of their excellent performance. In skeleton action recognition, coarse-grained features are an important complement to fine-grained features and can effectively improve recognition performance. However, existing multi-granularity skeleton action recognition methods have two shortcomings. First, the coarse-grained features they construct do not accurately retain the structural information among locally adjacent fine-grained joints. Second, they do not make good use of the global dependencies between coarse-grained and fine-grained features during feature learning. To address these problems, the proposed method uses the arithmetic mean and a classical convolution operation, respectively, to capture the position and structure information of locally adjacent fine-grained joints when constructing coarse-grained joints. It then uses a cross-attention mechanism to capture the global dependencies between the coarse-grained and fine-grained features; while fusing the two, this better characterizes part-level movement trends and improves the representational ability and discriminability of the coarse-grained features. The method is combined with several skeleton action recognition models and evaluated under multiple evaluation protocols on the NTU RGB+D and NTU RGB+D 120 action recognition datasets. Experimental results show that the proposed method can extract and fuse skeleton action features of different granularities and significantly improves the classification performance of human skeleton action recognition methods.
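The two steps named in the abstract, building coarse-grained joints from locally adjacent fine-grained joints and relating the two granularities with cross attention, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The toy six-joint skeleton, the body-part grouping `parts`, the way the mean and convolution outputs are summed, the choice of coarse features as queries, and the residual fusion are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def build_coarse_joints(fine, parts, conv_w):
    """Build one coarse joint per body part (illustrative assumption).

    fine:   (N, C) features of N fine-grained joints.
    parts:  list of index lists, each with K fine joints (hypothetical grouping).
    conv_w: (K, C, D) kernel, one (C, D) weight slice per position in a part.
    Returns a (P, D) array of coarse joint features (requires C == D here).
    """
    coarse = []
    for idx in parts:
        group = fine[idx]                       # (K, C) joints of this part
        mean_feat = group.mean(axis=0)          # arithmetic mean: positional summary
        # Convolution with kernel size K over the joints of the part: each
        # position has its own weights, so the local joint layout matters.
        conv_feat = np.einsum("kc,kcd->d", group, conv_w)
        coarse.append(mean_feat + conv_feat)    # assumed combination: simple sum
    return np.stack(coarse)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product cross attention: queries from one granularity,
    keys and values from the other."""
    q = query_feats @ w_q                       # (P, D)
    k = context_feats @ w_k                     # (N, D)
    v = context_feats @ w_v                     # (N, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (P, N) similarity scores
    attn = softmax(scores, axis=-1)             # each coarse joint attends to all fine joints
    return attn @ v, attn


rng = np.random.default_rng(0)
N, C = 6, 4                                     # toy skeleton: 6 joints, 4 channels
fine = rng.normal(size=(N, C))
parts = [[0, 1], [2, 3], [4, 5]]                # hypothetical body-part grouping
conv_w = rng.normal(size=(2, C, C)) * 0.1

coarse = build_coarse_joints(fine, parts, conv_w)
w_q, w_k, w_v = (rng.normal(size=(C, C)) * 0.1 for _ in range(3))
attended, attn = cross_attention(coarse, fine, w_q, w_k, w_v)
fused = coarse + attended                       # assumed residual fusion
print(fused.shape, attn.shape)                  # (3, 4) (3, 6)
```

Because every coarse joint attends over all fine joints, each row of `attn` is a distribution over the whole skeleton, which is one way to read the "global dependencies" described above. In a trained model the kernel and projection matrices would be learned rather than sampled at random.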