[1] SOOMRO K, ZAMIR A R, SHAH M.UCF101:a dataset of 101 human actions classes from videos in the wild[EB/OL].[2021-05-17].https://arxiv.org/abs/1212.0402.
[2] KUEHNE H, JHUANG H, GARROTE E, et al.HMDB:a large video database for human motion recognition[C]//Proceedings of International Conference on Computer Vision.Washington D.C., USA:IEEE Press, 2011:2556-2563.
[3] HEILBRON F C, ESCORCIA V, GHANEM B, et al.ActivityNet:a large-scale video benchmark for human activity understanding[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2015:961-970.
[4] HE K M, ZHANG X Y, REN S Q, et al.Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2016:770-778.
[5] KARPATHY A, TODERICI G, SHETTY S, et al.Large-scale video classification with convolutional neural networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2014:1725-1732.
[6] TRAN D, BOURDEV L, FERGUS R, et al.Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision.Washington D.C., USA:IEEE Press, 2015:4489-4497.
[7] CARREIRA J, ZISSERMAN A.Quo Vadis, action recognition? A new model and the Kinetics dataset[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2017:4724-4733.
[8] YANG S G.An improved video classification method of deep learning[J].Modern Computer, 2017(8):66-69.(in Chinese)
[9] LIAO X D, JIA X X.Action recognition technology based on improved C3D neural network[J].Computer and Modernization, 2019(3):32-38.(in Chinese)
[10] WANG Q, SUN X K, FAN D Y.Fusion of spatio-temporal features based on deep learning for human action recognition[J].Transducer and Microsystem Technologies, 2020, 39(10):35-38.(in Chinese)
[11] LI Z G.Research on sports video classification based on deep learning and transfer learning[J].Electronic Measurement Technology, 2020, 43(18):21-25.(in Chinese)
[12] HARA K, KATAOKA H, SATOH Y.Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet?[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2018:6546-6555.
[13] HE K M, ZHANG X Y, REN S Q, et al.Identity Mappings in Deep Residual Networks[M].Berlin, Germany:Springer, 2016.
[14] XIE S N, GIRSHICK R, DOLLÁR P, et al.Aggregated residual transformations for deep neural networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2017:5987-5995.
[15] HUANG G, LIU Z, VAN DER MAATEN L, et al.Densely connected convolutional networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2017:2261-2269.
[16] CHEN Y, HUANG S.Video classification based on improved NeXtVLAD[J].Computer Engineering and Design, 2021, 42(3):749-754.(in Chinese)
[17] TRAN D, BOURDEV L, FERGUS R, et al.Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of IEEE International Conference on Computer Vision.Washington D.C., USA:IEEE Press, 2015:4489-4497.
[18] SIMONYAN K, ZISSERMAN A.Very deep convolutional networks for large-scale image recognition[EB/OL].[2021-05-17].https://arxiv.org/abs/1409.1556.
[19] LI M J, DONG L.Implementation of machine translation algorithm based on PyTorch[J].Computer Technology and Development, 2018, 28(10):160-163, 167.(in Chinese)
[20] HU J, SHEN L, SUN G.Squeeze-and-excitation networks[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2018:7132-7141.
[21] BAHDANAU D, CHO K, BENGIO Y.Neural machine translation by jointly learning to align and translate[EB/OL].[2021-05-17].http://aps.arxiv.org/abs/1409.0473v2.
[22] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.ImageNet classification with deep convolutional neural networks[J].Communications of the ACM, 2017, 60(6):84-90.
[23] BARTZ C, HEROLD T, YANG H, et al.Language identification using deep convolutional recurrent neural networks[M].Berlin, Germany:Springer, 2017.
[24] SIMONYAN K, ZISSERMAN A.Two-stream convolutional networks for action recognition in videos[EB/OL].[2021-05-17].https://arxiv.org/abs/1406.2199.
[25] NG J Y H, HAUSKNECHT M, VIJAYANARASIMHAN S, et al.Beyond short snippets:deep networks for video classification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2015:4694-4702.
[26] ZHI H X, YU H T, LI S M.Video classification based on cascaded encoding fusion of temporal and spatial deep features[J].Application Research of Computers, 2018, 35(3):926-929.(in Chinese)