Computer Engineering ›› 2026, Vol. 52 ›› Issue (2): 136-147. doi: 10.19678/j.issn.1000-3428.0070157

• Computer Vision and Image Processing •

SRMpose: Multi-Scale Feature Extraction Keypoint Detection Algorithm

DAN Chonghong1,2, WEI Honglei1,2, HE Zhou1,2, WU Guanfeng1,2   

  1. School of Mathematics, Southwest Jiaotong University, Chengdu 611756, Sichuan, China;
    2. National-Local Joint Engineering Laboratory of System Credibility Automatic Verification, Chengdu 611756, Sichuan, China
  • Received: 2024-07-22 Revised: 2024-09-10 Published: 2026-02-04

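  • About the authors: DAN Chonghong (但崇鸿), male, master's student, main research interests: data processing, deep learning, and computer vision; WEI Honglei (韦洪雷, corresponding author), senior engineer, Ph.D., E-mail: whl@swjtu.edu.cn; HE Zhou (何舟), master's degree; WU Guanfeng (吴贯锋), engineer, Ph.D.
  • Supported by:
    Humanities and Social Sciences Planning Fund of the Ministry of Education (23YJA890038).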

Abstract: Human keypoint detection is increasingly applied in fields such as motion behavior recognition and human-computer interaction. Taking the long jump as a case study, this work proposes a multi-scale feature extraction keypoint detection algorithm that aims to improve the accuracy of human keypoint detection while reducing computation and parameter count, and combines the algorithm with an implementation of intelligent distance detection. First, the LJDataset dataset is constructed to address the shortage of long jump datasets. Then, based on the YOLOv8 training framework, a new model, SRMpose, with a low parameter count and low computational cost is proposed. The model builds its backbone network from StarBlock; designs a Multi-channel Residual Block (MRB) module and a semi-coupled detection head module, SRMhead, to extract features; and introduces the lightweight sampling operators ADown and DySample to improve the processing efficiency of feature maps. Finally, the model is validated on three datasets: LJDataset, MPII, and COCO. Compared with YOLOv8n-pose, SRMpose improves mAP@0.5 and mAP@0.5:0.95 on the three datasets by 2.2 and 1.4 percentage points, 3.6 and 2.6 percentage points, and 1.9 and 1.2 percentage points, respectively; on average, its parameter count increases by 3.3% and its GFLOPs decrease by 21.7%. In addition, on the COCO and LJDataset datasets, compared with YOLOv8s, SRMpose has on average 48.3% fewer parameters and 59.6% fewer GFLOPs, while its mAP@0.5 is 1.4 percentage points lower and 0.3 percentage points higher, respectively, indicating that SRMpose effectively reduces the number of parameters and the amount of computation while maintaining model performance. On LJDataset, when the model's validation set is replaced with the COCO validation set, the performance gap between SRMpose and YOLOv8s is less than 1 percentage point, which demonstrates the overall performance advantage and generalization ability of SRMpose. It also indicates that LJDataset has a certain level of complexity and covers most human keypoint recognition features.
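
The abstract reports accuracy changes in percentage points (absolute differences between two mAP percentages) and parameter and GFLOPs changes in percent (relative to the baseline model). The following minimal Python sketch illustrates the difference between the two conventions. The function names and all numeric inputs are hypothetical and chosen only for illustration; the abstract does not give the underlying baseline values.

```python
def percentage_point_change(baseline_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - baseline_pct


def relative_change_percent(baseline: float, new: float) -> float:
    """Change relative to the baseline value, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical inputs, NOT taken from the paper.
baseline_map50 = 80.0   # assumed baseline mAP@0.5, in percent
new_map50 = 82.2        # assumed improved mAP@0.5, in percent
baseline_gflops = 10.0  # assumed baseline compute cost
new_gflops = 7.83       # assumed reduced compute cost

pp_gain = percentage_point_change(baseline_map50, new_map50)
rel_gain = relative_change_percent(baseline_map50, new_map50)
gflops_change = relative_change_percent(baseline_gflops, new_gflops)

print(f"mAP@0.5: {pp_gain:+.1f} percentage points ({rel_gain:+.2f}% relative)")
print(f"GFLOPs: {gflops_change:+.1f}% relative")
```

With these assumed inputs, the same accuracy gain gives different numbers under the two conventions, which is why the abstract states accuracy differences in percentage points and cost differences in percent.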

Key words: human keypoint detection, multi-scale feature extraction, long jump detection, attention mechanism, lightweight network
