[1] ZOU Z X, CHEN K Y, SHI Z W, et al. Object detection in 20 years:a survey[J]. Proceedings of the IEEE, 2023, 111(3):257-276. [2] 刘勇, 李杰, 张建林, 等. 基于深度学习的二维人体姿态估计研究进展[J]. 计算机工程, 2021, 47(3):1-16. LIU Y, LI J, ZHANG J L, et al. Research progress of two-dimensional human pose estimation based on deep learning[J]. Computer Engineering, 2021, 47(3):1-16. (in Chinese) [3] XIAO B, WU H P, WEI Y C. Simple baselines for human pose estimation and tracking[C]//Proceedings of European Conference on Computer Vision. Berlin, Germany:Springer, 2018:472-487. [4] CHEN Y L, WANG Z C, PENG Y X, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2018:7103-7112. [5] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2019:5693-5703. [6] TOSHEV A, SZEGEDY C. DeepPose:human pose estimation via deep neural networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2014:1653-1660. [7] FANG H S, XIE S Q, TAI Y W, et al. RMPE:regional multi-person pose estimation[C]//Proceedings of IEEE International Conference on Computer Vision. Washington D. C., USA:IEEE Press, 2017:2334-2343. [8] NEWELL A, HUANG Z A, DENG J. Associative embedding:end-to-end learning for joint detection and grouping[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA:ACM Press, 2017:2274-2284. [9] CHENG B W, XIAO B, WANG J D, et al. HigherHRNet:scale-aware representation learning for bottom-up human pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2020:5386-5395. [10] KREISS S, BERTONI L, ALAHI A. PifPaf:composite fields for human pose estimation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2019:11977-11986. [11] CAO Z, SIMON T, WEI S, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2017:7291-7299. [12] MAJI D, NAGORI S, MATHEW M, et al. YOLO-Pose:enhancing YOLO for multi person pose estimation using object keypoint similarity loss[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2022:2637-2646. [13] JOCHER G. YOLOv5 release v6.1[EB/OL].[2023-02-10]. https://github.com/ultralytics/yolov5/releases/tag/v6.1. [14] QIU H B, WANG C Y, WANG J D, et al. Cross view fusion for 3D human pose estimation[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA:IEEE Press, 2019:4342-4351. [15] DU S L, WANG H, YUAN Z W, et al. Bi-Pose:bidirectional 2D-3D transformation for human pose estimation from a monocular camera[EB/OL].[2023-02-10]. https://ieeexplore.ieee.org/document/10141872. [16] LIU S G, LI Y, HUA G G. Human pose estimation in video via structured space learning and halfway temporal evaluation[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(7):2029-2038. [17] ZHU X K, LYU S C, WANG X, et al. TPH-YOLOv5:improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA:IEEE Press, 2021:2778-2788. [18] 王程, 刘元盛, 刘圣杰. 基于改进YOLOv4的小目标行人检测算法[J]. 计算机工程, 2023, 49(2):296-302, 313. WANG C, LIU Y S, LIU S J. Small-target pedestrian-detection algorithm based on improved YOLOv4[J]. Computer Engineering, 2023, 49(2):296-302, 313. (in Chinese) [19] TAN M X, PANG R M, LE Q V. EfficientDet:scalable and efficient object detection[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2020:10781-10790. [20] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2018:8759-8768. [21] GEVORGYAN Z. SIoU Loss:more powerful learning for bounding box regression[EB/OL].[2023-02-10]. https://arxiv.org/abs/2205.12740. [22] 胡欣, 周运强, 肖剑, 等. 基于改进YOLOv5的螺纹钢表面缺陷检测[J]. 图学学报, 2023, 44(3):427-437. HU X, ZHOU Y Q, XIAO J, et al. Surface defect detection of threaded steel based on improved YOLOv5[J]. Journal of Graphics, 2023, 44(3):427-437. (in Chinese) [23] ARTHUR D, VASSILVITSKII S. k-means++:the advantages of careful seeding[C]//Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms. New York, USA:ACM Press, 2007:1027-1035. [24] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet:a new backbone that can enhance learning capability of CNN[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2020:390-391. [25] ZHENG Z H, WANG P, REN D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8):8574-8586. [26] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO:common objects in context[C]//Proceedings of ECCV 2014. Berlin, Germany:Springer, 2014:740-755. [27] WOO S, PARK J, LEE J Y, et al. CBAM:convolutional block attention module[C]//Proceedings of ECCV 2018. Berlin, Germany:Springer, 2018:3-19. [28] LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2017:2117-2125. [29] BOCHKOVSKIY A, WANG C Y, LIAO H. YOLOv4:optimal speed and accuracy of object detection[EB/OL].[2023-02-10]. https://arxiv.org/abs/2004.10934. [30] LI J F, WANG C, ZHU H, et al. CrowdPose:efficient crowded scenes pose estimation and a new benchmark[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2019:10863-10872. [31] ZHANG S F, XIE Y L, WAN J, et al. WiderPerson:a diverse dataset for dense pedestrian detection in the wild[J]. IEEE Transactions on Multimedia, 2020, 22(2):380-393. [32] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7:trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA:IEEE Press, 2023:7464-7475. |