Human Action Recognition Algorithm Based on 3D Convolution Neural Network

Computer Engineering, 2019, Vol. 45, Issue 1: 259-263. doi: 10.19678/j.issn.1000-3428.0048978

ZHANG Rui¹,², LI Qishen², CHU Jun²

1. School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China; 2. Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang 330063, China

Received: 2017-10-16 Online: 2019-01-15 Published: 2019-01-15
About the authors: ZHANG Rui (born 1993), female, master's student, whose main research interests are image processing and pattern recognition; LI Qishen, associate professor, Ph.D.; CHU Jun, professor, Ph.D., doctoral supervisor.
Funding:
National Natural Science Foundation of China (61663031); Natural Science Foundation of Jiangxi Province (20132BAB201046); Nanchang Hangkong University Graduate Innovation Special Fund (YC2016009)

Abstract:

The diversity of human actions, noisy scenes, and changing camera motion and viewpoints make human action recognition difficult. To address this, this paper proposes a human action recognition algorithm based on a 3D convolutional neural network. Each group of 16 consecutive video frames forms one input. The input is processed as multiple channels (gray level, x-direction gradient, y-direction gradient, x-direction optical flow, and y-direction optical flow), which are used to train the network parameters. Five 3D convolution layers and five 3D pooling layers increase the action information along the time dimension in the extracted features, and two fully connected layers followed by a softmax classifier produce the recognition result. Experiments on the UCF101 dataset show that, compared with the iDT, P-CNN, and LRCN algorithms, the proposed algorithm achieves higher recognition accuracy and runs faster.

Key words: human action recognition, multi-channel, 3D convolution, 3D pooling, time dimension
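
The layer stack described in the abstract can be followed as a simple shape walk-through. The sketch below is only illustrative: the abstract gives the 16-frame input, the 5 input channels, and the count of 5 convolution and 5 pooling layers, but the spatial size (112 x 112), the 3x3x3 kernels, the filter counts, and the pooling windows are assumptions borrowed from common 3D CNN designs, not values reported by the authors.

```python
# Shape walk-through for a 5-conv / 5-pool 3D CNN on a 16-frame, 5-channel clip.
# Everything marked ASSUMED is illustrative and not taken from the paper.

FRAMES = 16      # from the abstract: 16 consecutive frames per input
CHANNELS = 5     # from the abstract: gray, grad-x, grad-y, flow-x, flow-y
HEIGHT = WIDTH = 112                      # ASSUMED spatial size
CONV_FILTERS = [64, 128, 256, 512, 512]   # ASSUMED filter counts
# ASSUMED pooling windows (time, height, width); stride equals the window.
POOL_WINDOWS = [(1, 2, 2), (2, 2, 2), (2, 2, 2), (2, 2, 2), (2, 2, 2)]


def conv3d_shape(shape, out_channels):
    """ASSUMED 3x3x3 convolution, stride 1, padding 1: only channels change."""
    _, t, h, w = shape
    return (out_channels, t, h, w)


def pool3d_shape(shape, window):
    """Non-overlapping 3D pooling with floor division on each axis."""
    c, t, h, w = shape
    pt, ph, pw = window
    return (c, t // pt, h // ph, w // pw)


def trace_shapes():
    """Return the (channels, time, height, width) shape after each stage."""
    shape = (CHANNELS, FRAMES, HEIGHT, WIDTH)
    trace = [("input", shape)]
    for i, (filters, window) in enumerate(zip(CONV_FILTERS, POOL_WINDOWS), 1):
        shape = conv3d_shape(shape, filters)
        trace.append((f"conv{i}", shape))
        shape = pool3d_shape(shape, window)
        trace.append((f"pool{i}", shape))
    return trace


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:>6}: {shape}")
```

Under these assumed settings the time axis shrinks from 16 to 1 across the five pooling stages, so the final feature map has folded the whole clip into a single temporal slice before it would be flattened and passed to the two fully connected layers and the softmax classifier over the 101 UCF101 classes.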

CLC number: TP391

ZHANG Rui, LI Qishen, CHU Jun. Human Action Recognition Algorithm Based on 3D Convolution Neural Network[J]. Computer Engineering, 2019, 45(1): 259-263.

http://www.ecice06.com/CN/Y2019/V45/I1/259
