[1] ZHENG C, WU W H, CHEN C, et al. Deep learning-based human pose estimation: a survey [EB/OL]. [2021-09-05]. https://arxiv.org/abs/2012.13392.
[2] FENG X Y, SONG J. Research advance on 2D human pose estimation [J]. Computer Science, 2020, 47(11): 128-136. (in Chinese)
[3] WEI S H, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines [C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2016: 4724-4732.
[4] NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation [C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2016: 483-499.
[5] BULAT A, KOSSAIFI J, TZIMIROPOULOS G, et al. Toward fast and accurate human pose estimation via soft-gated skip connections [C]//Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition. Washington D.C., USA: IEEE Press, 2020: 8-15.
[6] FANG H S, XIE S Q, TAI Y W, et al. RMPE: regional multi-person pose estimation [C]//Proceedings of IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2017: 2353-2362.
[7] CHEN Y L, WANG Z C, PENG Y X, et al. Cascaded pyramid network for multi-person pose estimation [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 7103-7112.
[8] LI W B, WANG Z C, YIN B Y, et al. Rethinking on multi-stage networks for human pose estimation [EB/OL]. [2021-09-05]. https://arxiv.org/abs/1901.00148.
[9] QI T, BAYRAMLI B, ALI U, et al. Spatial shortcut network for human pose estimation [EB/OL]. [2021-09-05]. https://arxiv.org/abs/1904.03141.
[10] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 5686-5696.
[11] CAO Z, HIDALGO G, SIMON T, et al. OpenPose: realtime multi-person 2D pose estimation using part affinity fields [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(1): 172-186.
[12] MARTINEZ G H, RAAJ Y, IDREES H, et al. Single-network whole-body pose estimation [C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 6981-6990.
[13] CHENG B W, XIAO B, WANG J D, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2020: 5385-5394.
[14] GENG Z G, SUN K, XIAO B, et al. Bottom-up human pose estimation via disentangled keypoint regression [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2021: 14671-14681.
[15] TSOTSOS J K. Analyzing vision at the complexity level [J]. Behavioral and Brain Sciences, 1990, 13(3): 423-445.
[16] TSOTSOS J K. A computational perspective on visual attention [M]. Cambridge, USA: MIT Press, 2011.
[17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 7132-7141.
[18] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 3-19.
[19] HU J, SHEN L, ALBANIE S, et al. Gather-Excite: exploiting feature context in convolutional neural networks [EB/OL]. [2021-09-05]. https://arxiv.org/abs/1810.12348.
[20] LINSLEY D, SHIEBLER D, EBERHARDT S, et al. Learning what and where to attend [EB/OL]. [2021-09-05]. https://arxiv.org/abs/1805.08819.
[21] BELLO I, ZOPH B, LE Q, et al. Attention augmented convolutional networks [C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 3285-3294.
[22] MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module [C]//Proceedings of IEEE Winter Conference on Applications of Computer Vision. Washington D.C., USA: IEEE Press, 2021: 3138-3147.
[23] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 7794-7803.
[24] CAO Y, XU J, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond [C]//Proceedings of IEEE/CVF International Conference on Computer Vision Workshops. Washington D.C., USA: IEEE Press, 2019: 12-36.
[25] CHEN Y P, KALANTIDIS Y, LI J S, et al. A2-Nets: double attention networks [EB/OL]. [2021-09-05]. https://arxiv.org/abs/1810.11579.
[26] LIU J J, HOU Q B, CHENG M M, et al. Improving convolutional networks with self-calibrated convolutions [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2020: 10093-10102.
[27] GAO Z L, XIE J T, WANG Q L, et al. Global second-order pooling convolutional networks [C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2019: 3019-3028.
[28] HUANG Z L, WANG X G, WEI Y C, et al. CCNet: criss-cross attention for semantic segmentation [C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 603-612.
[29] XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking [C]//Proceedings of European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 466-481.
[30] LUO M S, XU Y, YE X X. Human pose estimation using high resolution network with dual attention [J]. Computer Engineering, 2022, 48(2): 314-320. (in Chinese)