Computer Engineering ›› 2024, Vol. 50 ›› Issue (7): 71-78. doi: 10.19678/j.issn.1000-3428.0069491

• Smart Education •

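  • Funding:
    Guangdong Provincial Sports Bureau 2022-2023 General Research Project on Scientific and Technological Innovation and Sports Culture Development (GDSS2022N060); Guangzhou Philosophy and Social Sciences Development "14th Five-Year Plan" 2021 Co-construction Project (2021GZGJ310)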

Dynamic Time Warping Capture Algorithm for 3D Human Body Movements in Track and Field Video Recording Under Feature Fusion

Juquan TAN1,*, Ran WANG2

  1. College of P.E. Teaching, South China Agricultural University (SCAU), Guangzhou 510100, Guangdong, China
  2. School of Physical Education and Sports Science, South China Normal University, Guangzhou 510006, Guangdong, China
  • Received: 2024-03-05 Published: 2024-07-15 Online: 2024-07-09
  • Contact: Juquan TAN

Abstract:

Because athletes differ in movement style and speed, the human motion sequences recorded in track and field videos differ in length, which leaves sequences of different lengths misaligned. To address this, a feature-fusion-based Dynamic Time Warping (DTW) algorithm for capturing three-dimensional (3D) human movements in track and field videos is proposed. Human motion data extracted from the videos are converted into 3D coordinate sequences that represent the position and movement of each body part, and a depth map sequence of the motion is obtained. Gradient Local Auto-Correlation (GLAC) and Space-Time Auto-Correlation of Gradients (STACOG) are used to analyze the gradient characteristics and temporal autocorrelation of local regions in the depth maps. The Canny operator extracts the edge contour of each depth map frame, and wavelet transform is combined with k-means clustering to analyze the dynamic changes of the body contour. Kinect devices provide the 3D coordinates of human skeletal points, and Principal Component Analysis (PCA) fuses multiple features, such as 3D spatial coordinates and joint angles, into a single feature space. After preprocessing by frame filling and frame deletion, the most important principal components are selected to construct a new low-dimensional feature space. DTW then computes the similarity between video sequences, which allows 3D human movements to be captured from sequences of different lengths. Experimental results show that the algorithm captures 3D human movements in track and field videos with an accuracy of up to 99.07%, and that for highly complex movements the similarity between the captured and actual movements stays above 97%. In addition, the extracted contour features of the human movements have smooth, continuous lines and are highly consistent with the actual movements.

Key words: feature fusion, track and field video recording, 3D human body movements, Dynamic Time Warping (DTW) algorithm, action capture
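
The last two stages described in the abstract, PCA-based fusion into a low-dimensional feature space followed by DTW comparison of sequences of different lengths, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuse_with_pca`, `dtw_distance`, and `synthetic_motion`, the synthetic three-column per-frame features, the Euclidean frame cost, and the choice of two principal components are all assumptions made for illustration, and the depth map, contour, and frame filling/deletion steps are omitted.

```python
import numpy as np


def fuse_with_pca(sequences, n_components):
    """Project per-frame feature vectors onto the top principal components.

    sequences: list of arrays of shape (T_i, D); lengths T_i may differ.
    The PCA basis is fitted on all frames stacked together (an assumption).
    """
    frames = np.vstack(sequences)
    mean = frames.mean(axis=0)
    # Rows of vt are the principal directions of the centered frames.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    basis = vt[:n_components].T  # shape (D, n_components)
    return [(seq - mean) @ basis for seq in sequences]


def dtw_distance(a, b):
    """Classic DTW with Euclidean frame cost; a is (n, d), b is (m, d)."""
    n, m = len(a), len(b)
    # Pairwise distance between every frame of a and every frame of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # frame of a repeated against b
                acc[i, j - 1],      # frame of b repeated against a
                acc[i - 1, j - 1],  # both sequences advance
            )
    return acc[n, m]


def synthetic_motion(num_frames, freq):
    """Hypothetical stand-in for fused joint features of one clip."""
    t = np.linspace(0.0, 1.0, num_frames)
    return np.column_stack(
        [np.sin(2 * np.pi * freq * t), np.cos(2 * np.pi * freq * t), t]
    )


# Same movement at two speeds (60 vs 40 frames) and a different movement.
slow = synthetic_motion(60, 1.0)
fast = synthetic_motion(40, 1.0)
other = synthetic_motion(50, 3.0)

slow_fused, fast_fused, other_fused = fuse_with_pca([slow, fast, other], 2)

d_same = dtw_distance(slow_fused, fast_fused)
d_diff = dtw_distance(slow_fused, other_fused)
print(f"same movement, different speed: {d_same:.3f}")
print(f"different movement:             {d_diff:.3f}")
```

In this toy setting the two clips of the same movement end up much closer under DTW than the clip of a different movement, even though their frame counts differ. How the paper converts a DTW cost into the reported similarity percentage is not stated in the abstract, so no such conversion is attempted here.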