
Computer Engineering

   

Non-convex Temporal Difference Low-Rank Constrained Motion Discontinuous Spatio-Temporal Behavior Understanding

  

  • Published: 2026-05-07


Abstract: In complex dynamic scenes, unstructured occlusion, motion distortion, and coupled multi-source noise degrade the low-rank structure of spatio-temporal data and leave it only partially observed. To address this, this paper proposes a framework for understanding motion-discontinuous spatio-temporal behavior that integrates non-convex temporal difference low-rank (NTDLR) constraints with hierarchical trajectory-to-behavior semantic mapping. First, a temporal difference low-rank recovery model based on the non-convex Schatten-p norm is constructed and solved with the Alternating Direction Method of Multipliers (ADMM) to reconstruct motion data under high missing rates and noise contamination. Second, the recovered data are combined with multi-object tracking to build structured trajectory clusters, from which trajectory neighborhood interaction features are extracted. Third, a three-level behavior understanding model is proposed: behavior primitive classification with multilayer perceptrons, interaction pattern recognition with graph attention networks, and semantic fusion with behavior narrative generation that incorporates spatio-temporal context. Together these achieve an end-to-end mapping from trajectories to high-level semantics. Experiments show that, at a 60% missing rate, the proposed method recovers data with significantly higher quality than the baselines. It reaches a behavior recognition accuracy of 92.7% on both NTU RGB+D (X-Sub) and the self-built motion dataset BAS, 5.6 percentage points above the best competing method. Ablation studies validate each module: with 60% of the data missing, the NTDLR recovery module raises the recognition rate from 78.3% to 86.7%, trajectory neighborhood encoding raises it further to 88.2%, and the complete three-level model achieves the best performance through the combined effect of its components. Interaction pattern recognition and semantic description generation also clearly outperform mainstream graph convolutional networks and their variants. This research provides an interpretable and scalable algorithmic framework for understanding discontinuous, interactive motion behavior in complex dynamic scenes.
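The low-rank prior on temporal differences mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's recovery model or its ADMM solver. It only assumes the standard Schatten-p definition (the sum of singular values raised to the power p, with 0 < p < 1) and a first-order difference along the time axis. The function names, the choice p = 0.5, and the toy data are all hypothetical.

```python
import numpy as np


def schatten_p(matrix, p=0.5):
    """Sum of singular values raised to the power p (Schatten-p quasi-norm to the p-th power)."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values ** p))


def temporal_difference(matrix):
    """First-order difference along the time axis; columns are assumed to be frames."""
    return np.diff(matrix, axis=1)


# Toy motion matrix: 30 coordinates over 50 frames, driven by two smooth
# temporal basis functions, so both the data and its differences have rank 2.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
basis = np.vstack([t, t ** 2])            # shape (2, 50)
mixing = rng.standard_normal((30, 2))     # shape (30, 2)
clean = mixing @ basis                    # shape (30, 50)

# Additive noise destroys the low-rank structure of the differences.
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

clean_rank = np.linalg.matrix_rank(temporal_difference(clean))
noisy_rank = np.linalg.matrix_rank(temporal_difference(noisy))
clean_penalty = schatten_p(temporal_difference(clean))
noisy_penalty = schatten_p(temporal_difference(noisy))

print("rank of differences (clean, noisy):", clean_rank, noisy_rank)
print("Schatten-p penalty (clean, noisy):", clean_penalty, noisy_penalty)
```

In this toy setting the penalty is much smaller for the clean differences than for the noisy ones, which is the property a recovery model of this kind would exploit when minimizing such a term subject to agreement with the observed entries.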
