DAN Chonghong, WEI Honglei, HE Zhou, WU Guanfeng
Human keypoint detection is increasingly applied in fields such as motion behavior recognition and human-computer interaction. Taking the long jump as a case study, this study proposes a multi-scale feature extraction keypoint detection algorithm that improves the accuracy of human keypoint detection while reducing computational cost and parameter count, and combines it with an implementation of intelligent distance detection. First, the study constructs LJDataset to fill the gap left by the lack of long jump datasets; then, based on the YOLOv8 training framework, it proposes a new model, SRMpose, with a low parameter count and low computational complexity. The model uses StarBlock to build the backbone network; designs a Multi-channel Residual Block (MRB) module and a semi-coupled detection head, SRMhead, to extract features; and introduces the lightweight sampling operators ADown and DySample to improve the efficiency of feature map processing. The model is validated on three datasets: LJDataset, MPII, and COCO. Compared with YOLOv8n-pose, SRMpose performs better on all three datasets, with mAP@0.5 and mAP@0.5:0.95 increasing by 2.2 and 1.4, 3.6 and 2.6, and 1.9 and 1.2 percentage points on LJDataset, MPII, and COCO, respectively. On average, the parameter count increases by 3.3% and GFLOPs decrease by 21.7%. In addition, compared with YOLOv8s on COCO and LJDataset, SRMpose's parameter count decreases by an average of 48.3% and its GFLOPs by an average of 59.6%, while mAP@0.5 decreases by 1.4 percentage points on COCO and increases by 0.3 percentage points on LJDataset, showing that SRMpose effectively reduces the number of parameters and computations while maintaining model performance. For the model trained on LJDataset, the validation set was then replaced with the COCO validation set. The results show that the performance gap between SRMpose and YOLOv8s is less than 1 percentage point, demonstrating the overall performance advantage and generalization ability of SRMpose. Moreover, they indicate that LJDataset has a certain level of complexity and covers most human keypoint recognition features.
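The abstract reports accuracy changes in percentage points (absolute differences between two mAP values) and model-size changes in percent (differences relative to the baseline). The two conventions can be sketched as follows. The function names and all input values are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from this study.

```python
def percentage_point_change(baseline: float, new: float) -> float:
    """Absolute change between two rates given as fractions in [0, 1],
    expressed in percentage points (the convention used for mAP)."""
    return (new - baseline) * 100.0


def relative_percent_change(baseline: float, new: float) -> float:
    """Change relative to the baseline, in percent (the convention used
    for parameter count and GFLOPs). Negative values mean a decrease."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical inputs, for illustration only (not values from the study).
map_gain = percentage_point_change(0.800, 0.822)
gflops_change = relative_percent_change(10.0, 7.83)
print(f"mAP change: {map_gain:+.1f} percentage points")
print(f"GFLOPs change: {gflops_change:+.1f}%")
```

Keeping the two measures separate matters because a gain of a few percentage points in mAP and a reduction of some percent in GFLOPs are not on the same scale and cannot be compared directly.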