1 |
TOSHEV A, SZEGEDY C. DeepPose: human pose estimation via deep neural networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2014: 1653-1660.
|
2 |
LI S J, LIU Z Q, CHAN A B. Heterogeneous multi-task learning for human pose estimation with deep convolutional neural network. International Journal of Computer Vision, 2015, 113(1): 19- 36.
doi: 10.1007/s11263-014-0767-8
|
3 |
ZHANG W Q, FANG J M, WANG X G, et al. EfficientPose: efficient human pose estimation with neural architecture search. Computational Visual Media, 2021, 7(3): 335- 347.
doi: 10.1007/s41095-021-0214-z
|
4 |
XIAO B, WU H P, WEI Y C. Simple baselines for human pose estimation and tracking[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 472-487.
|
5 |
ANDRILUKA M, PISHCHULIN L, GEHLER P, et al. 2D human pose estimation: new benchmark and state of the art analysis[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2014: 3686-3693.
|
6 |
JOHNSON S, EVERINGHAM M. Combining discriminative appearance and segmentation cues for articulated human pose estimation[C]//Proceedings of the 12th International Conference on Computer Vision Workshops. Washington D. C., USA: IEEE Press, 2010: 405-412.
|
7 |
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2014: 740-755.
|
8 |
SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 5686-5696.
|
9 |
|
10 |
TAN M X, CHEN B, PANG R M, et al. MnasNet: platform-aware neural architecture search for mobile[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 2815-2823.
|
11 |
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 6000-6010.
|
12 |
DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[EB/OL]. [2022-05-11]. https://arxiv.org/abs/2010.11929.
|
13 |
|
14 |
LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: hierarchical vision Transformer using shifted windows[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2022: 9992-10002.
|
15 |
XIONG Z, WANG C, LI Y, et al. Swin-Pose: swin Transformer based human pose estimation[C]//Proceedings of the 5th International Conference on Multimedia Information Processing and Retrieval. Washington D. C., USA: IEEE Press, 2022: 1-10.
|
16 |
|
17 |
LIU Z, MAO H Z, WU C Y, et al. A ConvNet for the 2020s[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2022: 11966-11976.
|
18 |
SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck Transformers for visual recognition[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 16514-16524.
|
19 |
TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image Transformers & distillation through attention[EB/OL]. [2022-05-11]. https://arxiv.org/abs/2012.12877.
|
20 |
HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2022-05-11]. https://arxiv.org/abs/1704.04861.
|
21 |
SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 4510-4520.
|
22 |
HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011- 2023.
doi: 10.1109/TPAMI.2019.2913372
|
23 |
WEI S H, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2016: 4724-4732.
|
24 |
NEWELL A, YANG K Y, DENG J. Stacked Hourglass networks for human pose estimation[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2016: 483-499.
|
25 |
CHEN Y L, WANG Z C, PENG Y X, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 7103-7112.
|
26 |
HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 386- 397.
doi: 10.1109/TPAMI.2018.2844175
|
27 |
YU C Q, XIAO B, GAO C X, et al. Lite-HRNet: a lightweight high-resolution network[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 10435-10445.
|
28 |
CAO Z, SIMON T, WEI S H, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2017: 1302-1310.
|
29 |
CHENG B W, XIAO B, WANG J D, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 5385-5394.
|
30 |
YANG S, QUAN Z B, NIE M, et al. TransPose: keypoint localization via Transformer[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2022: 11782-11792.
|
31 |
LIN K, WANG L J, LIU Z C. End-to-end human pose and mesh reconstruction with Transformers[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 1954-1963.
|
32 |
ZHANG J L, TU Z G, YANG J Y, et al. MixSTE: seq2seq mixed spatio-temporal encoder for 3D human pose estimation in video[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2022: 13222-13232.
|
33 |
HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2016: 770-778.
|
34 |
AZAD R, HEIDARI M, WU Y L, et al. Contextual attention network: Transformer meets U-net. Berlin, Germany: Springer, 2022.
|
35 |
XU Y F, ZHANG J, ZHANG Q M, et al. ViTPose: simple vision Transformer baselines for human pose estimation[EB/OL]. [2022-05-11]. https://arxiv.org/abs/2204.12484.
|
36 |
WANG Q L, WU B G, ZHU P F, et al. ECANet: efficient channel attention for deep convolutional neural networks[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 11531-11539.
|
37 |
WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 1-8.
|
38 |
HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 13708-13717.
|
39 |
GENG Z G, SUN K, XIAO B, et al. Bottom-up human pose estimation via disentangled keypoint regression[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 14671-14681.
|
40 |
MA N N, ZHANG X Y, ZHENG H T, et al. ShuffleNetV2: practical guidelines for efficient CNN architecture design[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 122-138.
|
41 |
CHEN Y P, DAI X Y, LIU M C, et al. Dynamic ReLU[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 351-367.
|
42 |
PAPANDREOU G, ZHU T, KANAZAWA N, et al. Towards accurate multi-person pose estimation in the wild[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2017: 3711-3719.
|
43 |
FANG H S, XIE S Q, TAI Y W, et al. RMPE: regional multi-person pose estimation[C]//Proceedings of IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2017: 2353-2362.
|