
Computer Engineering ›› 2021, Vol. 47 ›› Issue (11): 283-291. doi: 10.19678/j.issn.1000-3428.0059314

• Development Research and Engineering Application •

Dynamic Gesture Recognition Model Based on 3D Convolutional Neural Network

XU Fang, HUANG Jun, CHEN Quan   

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2020-08-20  Revised: 2020-11-06  Published: 2020-11-13

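  • About the authors: XU Fang (b. 1995), male, master's student; main research interests: deep learning, action recognition, and image processing. HUANG Jun, professor. CHEN Quan, master's student.
  • Funding:
    National Natural Science Foundation of China (61671095).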

Abstract: As an important mode of human-computer interaction, gesture interaction and gesture recognition have become research hotspots in virtual reality, remote control, and other fields owing to their flexibility and convenience. Recognition accuracy tends to drop when dynamic gestures are recognized in videos that lack marker frames. To address this problem, a dynamic gesture recognition model with a hierarchical network structure is proposed. A gesture detection model serves as the first-level network and a gesture classification model as the second-level network, so the recognition task is completed step by step. In addition, because both the staged processing and the large number of parameters in a 3D convolutional neural network can make model training or inference excessively time-consuming, each 3D convolution kernel is split into a temporal convolution and a spatial convolution to reduce the model's time cost. Experimental results show that, while maintaining real-time performance, the model achieves a recognition accuracy of 93.35% on the EgoGesture dataset, outperforming C3D, ResNeXt101, MTUT, and other methods, which demonstrates the effectiveness of the proposed method.
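To illustrate why splitting a 3D kernel can reduce cost, the following minimal sketch compares the weight count of a full 3D convolution with that of a spatial convolution followed by a temporal one. The abstract does not give the layer sizes, the order of the two convolutions, or the intermediate channel width, so the function names, the spatial-then-temporal order, the default `c_mid = c_out`, and the 64-channel 3×3×3 example are all assumptions made for illustration only, not the authors' configuration.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Number of weights in a 3D convolution layer (bias terms ignored)."""
    return c_in * c_out * kt * kh * kw


def split_conv_params(c_in, c_out, kt, kh, kw, c_mid=None):
    """Weights of a (1 x kh x kw) spatial conv followed by a (kt x 1 x 1) temporal conv.

    c_mid is the channel width between the two convolutions.
    Assumption: it defaults to c_out, since the abstract does not specify it.
    """
    if c_mid is None:
        c_mid = c_out
    spatial = conv3d_params(c_in, c_mid, 1, kh, kw)    # spatial-domain convolution
    temporal = conv3d_params(c_mid, c_out, kt, 1, 1)   # time-domain convolution
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3x3x3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
split = split_conv_params(64, 64, 3, 3, 3)
print(full, split, round(split / full, 3))  # 110592 49152 0.444
```

Under these assumed sizes the split layer holds fewer than half the weights of the full 3D kernel. The actual saving in the published model depends on its real layer widths and kernel shapes.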

Key words: dynamic gesture recognition, hierarchical structure, convolution kernel split, 3D Convolutional Neural Network (CNN), gesture detector

