| 1 | WANG Z G. Real-time dance posture tracking method based on lightweight network. Wireless Communications and Mobile Computing, 2022, 2022(1): 5001896. doi: 10.1155/2022/5001896 |
																													
																							| 2 | LIU Z G, FENG R Y, CHEN H M, et al. Temporal feature alignment and mutual information maximization for video-based human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE Press, 2022: 10996-11006. | 
																													
| 3 | YI C Z, JIANG F, ZHANG S P, et al. Continuous prediction of lower-limb kinematics from multi-modal biomedical signals. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(5): 2592-2602. doi: 10.1109/TCSVT.2021.3071461 |
																													
| 4 | CUI C, MA Y S, CAO X, et al. Receive, reason, and react: drive as you say, with large language models in autonomous vehicles. IEEE Intelligent Transportation Systems Magazine, 2024, 16(4): 81-94. doi: 10.1109/MITS.2024.3381793 |
																													
| 5 | LI Y R. Research on intelligent classroom management system based on computer vision technology. Communication & Information Technology, 2024(2): 130-136. (in Chinese) |
																													
| 6 | KONG L K, WANG S. Research on artificial intelligence assisted motion recognition and exercise prescription. Modern Electronics Technique, 2024, 47(4): 139-142. (in Chinese) |
																													
| 7 | YANG R T, YUAN L, LIN Q, et al. A technical action diagnosis of sit ups based on human posture estimation. Communication & Information Technology, 2022(S2): 80-82. (in Chinese) |
																													
																							| 8 | PISHCHULIN L, ANDRILUKA M, GEHLER P, et al. Poselet conditioned pictorial structures[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE Press, 2013: 588-595. | 
																													
| 9 | YANG Y, RAMANAN D. Articulated human detection with flexible mixtures of parts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(12): 2878-2890. doi: 10.1109/TPAMI.2012.261 |
																													
																							| 10 | SUN M, SAVARESE S. Articulated part-based model for joint object detection and pose estimation[C]//Proceedings of the International Conference on Computer Vision. Barcelona, Spain: IEEE Press, 2011: 723-730. | 
																													
																							| 11 | TIAN Y, ZITNICK C L, NARASIMHAN S G. Exploring the spatial hierarchy of mixture models for human pose estimation[C]//Proceedings of the European Conference on Computer Vision. Florence, Italy: Springer, 2012: 256-269. | 
																													
| 12 | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386 |
																													
| 13 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates Inc., 2017: 6000-6010. |
																													
| 14 | NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]//Proceedings of the European Conference on Computer Vision. Amsterdam, the Netherlands: Springer, 2016: 483-499. |
																													
																							| 15 | CHEN Y L, WANG Z C, PENG Y X, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE Press, 2018: 7103-7112. | 
																													
																							| 16 | SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE Press, 2019: 5686-5696. | 
																													
																							| 17 | PISHCHULIN L, INSAFUTDINOV E, TANG S Y, et al. DeepCut: joint subset partition and labeling for multi person pose estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE Press, 2016: 4929-4937. | 
																													
																							| 18 | CAO Z, SIMON T, WEI S H, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE Press, 2017: 1302-1310. | 
																													
																							| 19 | WANG D K, ZHANG S L. Contextual instance decoupling for robust multi-person pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE Press, 2022: 11050-11058. | 
																													
																							| 20 | MAJI D, NAGORI S, MATHEW M, et al. YOLO-pose: enhancing YOLO for multi person pose estimation using object keypoint similarity loss[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE Press, 2022: 2636-2645. | 
																													
																							| 21 | SHI D H, WEI X, LI L Q, et al. End-to-end multi-person pose estimation with transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE Press, 2022: 11059-11068. | 
																													
																							| 22 | LI Y J, ZHANG S K, WANG Z C, et al. TokenPose: learning keypoint tokens for human pose estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE Press, 2021: 11293-11302. | 
																													
| 23 | LI K C, WANG Y L, ZHANG J H, et al. UniFormer: unifying convolution and self-attention for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(10): 12581-12600. doi: 10.1109/TPAMI.2023.3282631 |
																													
																							| 24 |  | 
																													
																							| 25 |  | 
																													
| 26 | XU Y F, ZHANG J, ZHANG Q M, et al. ViTPose: simple vision transformer baselines for human pose estimation. Advances in Neural Information Processing Systems, 2022, 35: 38571-38584. |
																													
																							| 27 |  | 
																													
| 28 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding[EB/OL]. [2024-02-01]. https://arxiv.org/abs/1810.04805v2. |
																													
| 29 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. [2024-02-01]. https://arxiv.org/abs/2010.11929. |
																													
| 30 | WANG Y L, HUANG R, SONG S J, et al. Not all images are worth 16×16 words: dynamic transformers for efficient image recognition[EB/OL]. [2024-02-01]. https://arxiv.org/abs/2105.15075v2. |
																													
| 31 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 740-755. |
																													
																							| 32 | ANDRILUKA M, PISHCHULIN L, GEHLER P, et al. 2D human pose estimation: new benchmark and state of the art analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE Press, 2014: 3686-3693. | 
																													
																							| 33 | XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking[C]//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018: 466-481. | 
																													
| 34 | DAS A, DAS S, SISTU G, et al. Deep multi-task networks for occluded pedestrian pose estimation[C]//Proceedings of the 24th Irish Machine Vision and Image Processing Conference. [S.l.]: Irish Pattern Recognition and Classification Society, 2022: 177-180. |
																													
																							| 35 |  | 
																													
																							| 36 | YANG S, QUAN Z B, NIE M, et al. TransPose: keypoint localization via transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE Press, 2021: 11782-11792. | 
																													
| 37 | ZHAO S T, LIU K, HUANG Y H, et al. DPIT: dual-pipeline integrated transformer for human pose estimation[C]//Proceedings of CAAI International Conference on Artificial Intelligence. Cham, Switzerland: Springer, 2022: 559-576. |