Real-time Hand Gesture Recognition Method Based on Improved YOLOv3

doi:10.19678/j.issn.1000-3428.0054222

Abstract

Abstract: Hand gesture recognition methods based on manual modeling suffer from low accuracy and slow speed. To address this, this paper proposes a real-time static hand gesture recognition method based on an improved YOLOv3. Using the YOLOv3 convolutional neural network model, the method replaces the commonly used RGB-only dataset with four types of images collected by a Kinect device (IR, Registration of RGB, RGB, and Depth) and fuses the recognition results of the four image types to improve recognition accuracy. The k-means clustering algorithm is used to optimize the initial candidate box parameters in YOLOv3, which speeds up recognition. On this basis, transfer learning is used to improve the base feature extractor and shorten model training time. Experimental results show that, for static hand gestures in streaming video, the proposed method achieves a mean Average Precision (mAP) of 99.8% and a recognition speed of up to 52 FPS, with a model training time of 12 hours. Compared with other deep learning methods such as Faster R-CNN, SSD, and YOLOv2, it achieves higher recognition accuracy and faster recognition speed.

Key words: hand gesture recognition, YOLOv3 model, Kinect equipment, clustering algorithm, transfer learning
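
The abstract says k-means clustering is used to optimize the initial candidate box parameters, but it gives neither the distance measure nor the number of clusters. As a rough illustration only, the sketch below assumes the 1 - IoU distance on (width, height) pairs that YOLOv2 introduced for this purpose. The function names `iou_wh` and `kmeans_anchors` and the sample boxes are hypothetical and are not taken from the paper.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    union = w * h + cw * ch - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k box shapes using d = 1 - IoU.

    Assumption: this distance follows YOLOv2; the paper's exact setup
    is not given in the abstract.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # pick k distinct boxes as initial centroids
    assignment = None
    for _ in range(iters):
        # Assign each box to the centroid with the highest IoU (smallest 1 - IoU).
        new_assignment = [
            max(range(k), key=lambda j: iou_wh(b, centroids[j])) for b in boxes
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids will not change
        assignment = new_assignment
        # Move each centroid to the mean width and height of its members.
        for j in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    # Sort by area so the smallest box shape comes first.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical labeled hand boxes in pixels: three small and three large.
sample_boxes = [(10, 12), (12, 10), (11, 11), (100, 120), (110, 100), (105, 110)]
anchors = kmeans_anchors(sample_boxes, k=2)
print(anchors)
```

In practice the input would be the widths and heights of all labeled gesture boxes in the training set, and `k` would be the number of candidate boxes the detector expects (the original YOLOv3 uses 9, split across three scales).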

CLC number: TP391

ZHANG Qiang, ZHANG Yong, LIU Zhiguo, ZHOU Wenjun, LIU Jiahui. Real-time Hand Gesture Recognition Method Based on Improved YOLOv3[J]. Computer Engineering, 2020, 46(3): 237-245,253.

https://www.ecice06.com/CN/Y2020/V46/I3/237

Figures/Tables: 14
