1 |
DENG M L, GAO Z D, LI L, et al. Overview of human behavior recognition based on deep learning[J]. Computer Engineering and Applications, 2022, 58(13): 14-26.
|
2 |
WANG H, SCHMID C. Action recognition with improved trajectories[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2013: 3551-3558.
|
3 |
KARPATHY A, TODERICI G, SHETTY S, et al. Large-scale video classification with convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2014: 1725-1732.
|
4 |
SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2014: 568-576.
|
5 |
XIE Z, ZHOU Y, WU K W, et al. Behavior recognition based on spatiotemporal attention LSTM[J]. Journal of Computer Science, 2021, 44(2): 261-274.
|
6 |
TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2015: 4489-4497.
|
7 |
QIU Z F, YAO T, MEI T. Learning spatio-temporal representation with pseudo-3D residual networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2017: 5534-5542.
|
8 |
DIBA A L, FAYYAZ M, SHARMA V, et al. Temporal 3D ConvNets: new architecture and transfer learning for video classification[EB/OL]. [2023-09-18]. https://arxiv.org/abs/1711.08200v1.
|
9 |
HARA K, KATAOKA H, SATOH Y. Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet?[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 6546-6555.
|
10 |
CHAUDHARI S, MITHAL V, POLATKAN G, et al. An attentive survey of attention models[J]. ACM Transactions on Intelligent Systems and Technology, 2021, 12(5): 1-32.
|
11 |
HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 7132-7141.
|
12 |
WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 3-19.
|
13 |
CAO Y, XU J R, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2019: 1971-1980.
|
14 |
WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 11531-11539.
|
15 |
|
16 |
|
17 |
CARVALHO S R, BERTAGNOLLI N M, FOLKMAN T, et al. A temporal bottleneck attention architecture for video action recognition: WO2021US59372[P]. 2022-05-19.
|
18 |
LI C H, ZHANG J, YAO J C. Streamer action recognition in live video with spatial-temporal attention and deep dictionary learning[J]. Neurocomputing, 2021, 453: 383-392.
doi: 10.1016/j.neucom.2020.07.148
|
19 |
GONG J, LUO C, LUO Q. Action recognition model based on attention mechanism and residual network[J]. Electronic Measurement Technology, 2021, 44(14): 111-116.
|
20 |
|
21 |
|
22 |
ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 6848-6856.
|
23 |
LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327.
doi: 10.1109/TPAMI.2018.2858826
|
24 |
ZHOU B, LI J F. Human behavior recognition combined with object detection[J]. Journal of Automation, 2020, 46(9): 1961-1970.
|
25 |
WANG L M, XIONG Y J, WANG Z, et al. Temporal segment networks for action recognition in videos[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(11): 2740-2755.
|