[1]HE Ning,CAO Jiaheng,SONG Lin.Scale space histogram of oriented gradients for human detection[C]//Proceedings of International Symposium on Information Science and Engineering.Washington D.C.,USA:IEEE Computer Society,2008:167-170.
[2]WANG Heng,KLSER A,SCHMID C,et al.Dense trajectories and motion boundary descriptors for action recognition[J].International Journal of Computer Vision,2013,103(1):60-79.
[3]JI Shuiwang,YANG Ming,YU Kai,et al.3D convolutional neural networks for human action recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,35(1):221-231.
[4]KARPATHY A,TODERICI G,SHETTY S,et al.Large-scale video classification with convolutional neural networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Computer Society,2014:1725-1732.
[5]SIMONYAN K,ZISSERMAN A.Two-stream convolutional networks for action recognition in videos[EB/OL].[2018-02-10].https://arxiv.org/pdf/1406.2199.pdf.
[6]WU Zuxuan,WANG Xi,JIANG Yugang,et al.Modeling spatial-temporal clues in a hybrid deep learning framework for video classification[C]//Proceedings of the 23rd ACM International Conference on Multimedia.New York,USA:ACM Press,2015:461-470.
[7]DONAHUE J,HENDRICKS L A,GUADARRAMA S,et al.Long-term recurrent convolutional networks for visual recognition and description[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Press,2015:2625-2634.
[8]NGJ Y H,HAUSKNECHT M,VIJAYANARASIMHAN S,et al.Beyond short snippets:deep networks for video classification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Press,2015:4694-4702.
[9]WANG Peng,CAO Yuanzhouhan,SHEN Chunhua,et al.Temporal pyramid pooling based convolutional neural network for action recognition[J].IEEE Transactions on Circuits and Systems for Video Technology,2017,27(12):2613-2622.
[10]ZHU Jiagang,ZOU Wei,ZHU Zheng.End-to-end video-level representation learning for action recognition [EB/OL].[2018-02-10].https://arxiv.org/pdf/1711.04161.pdf.
[11]WANG Yilin,WANG Suhang,TANG Jiliang,et al.Hierarchical attention network for action recognition in videos [EB/OL].[2018-02-10].https://arxiv.org/pdf/1607.06416.pdf.
[12]YAN Shiyang,SMITH J S,LU Wenjin,et al.CHAM:action recognition using convolutional hierarchical attention model[EB/OL].[2018-02-10].https://arxiv.org/pdf/1705.03146.pdf.
[13]YAN Shiyang,SMITH J S,LU Wenjin,et al.Hierarchical multi-scale attention networks for action recognition[J].Signal Processing Image Communication,2018,61:73-84.
[14]LEI Tao,ZHANG Yu.Training RNNs as fast as CNNs[EB/OL].[2018-02-10].https://arxiv.org/pdf/1709.02755v2.pdf.
[15]WANG Limin,XIONG Yuanjun,WANG Zhe,et al.Temporal segment networks:towards good practices for deep action recognition[C]//Proceedings of European Conference on Computer Vision.Berlin,Germany:Springer,2016:20-36.
[16]IOFFE S,SZEGEDY C.Batch normalization:accelerating deep network training by reducing internal covariate shift[EB/OL].[2018-02-10].https://arxiv.org/pdf/1502.03167.pdf.
[17]HE Kaiming,ZHANG Xiangyu,REN Shaoqing,et al.Spatial pyramid pooling in deep convolutional networks for visual recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,37(9):1904-1916.
[18]ZAGORUYKO S,KOMODAKIS N.Paying more attention to attention:improving the performance of convolutional neural networks via attention transfer[EB/OL].[2018-02-10].https://arxiv.org/pdf/1612.03928.pdf.
[19]LIN Weiyao,MI Yang,WU Jianxin,et al.Action recognition with coarse-to-fine deep feature integration and asynchronous fusion[EB/OL].[2018-02-10].https://arxiv.org/pdf/1711.07430.pdf.
[20]KETKAR N.Deep learning with Python[M].Berkeley,USA:Apress,2017:195-208.
[21]THUMOS challenge:action recognition with a large number of classes [EB/OL].[2018-02-10].http://crcv.ucf.edu/ICCV13-Action-Workshop/.
[22]ZACH C,POCK T,BISCHOF H.A duality based approach for realtime tv-L1 optical flow[C]//Proceedings of Joint Pattern Recognition Symposium.Berlin,Germany:Springer,2007:214-223.
[23]BRADSKI G.The opencv library[J].Journal of Software Tools,2000,25:120-125.
[24]DENG Jia,DONG Wei,SOCHER R,et al.ImageNet:a large-scale hierarchical image database[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition,Washington D.C.,USA:IEEE Press,2009:248-255.
[25]WANG Heng,SCHMID C.Action recognition with improved trajectories[C]//Proceedings of IEEE International Conference on Computer Vision.Washington D.C.,USA:IEEE Computer Society,2014:3551-3558.
[26]WANG Limin,QIAO Yu,TANG Xiaoou.MoFAP:a multi-level representation for action recognition [J].International Journal of Computer Vision,2016,119(3):254-271.
[27]WANG Limin,QIAO Yu,TANG Xiaoou.Action recognition with trajectory-pooled deep-convolutional descriptors[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Press,2015:4305-4314.
[28]TRAN D,BOURDEV L,FERGUS R,et al.Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of IEEE International Conference on Computer Vision.Washington D.C.,USA:IEEE Computer Society,2015:4489-4497.
[29]DIBA A,SHARMA V,GOOL L V.Deep temporal linear encoding networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Press,2017:1541-1550. |