Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 93-101. doi: 10.19678/j.issn.1000-3428.0069350

• Artificial Intelligence and Pattern Recognition •

Action Detection Method Based on Salient Target Tracking

SHAN Pengchang, GAO Lijian, DONG Wenlong, MAO Qirong*

  1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China
  • Received: 2024-02-04 Online: 2025-06-15 Published: 2025-06-05
  • Contact: MAO Qirong

基于显著目标追踪的行为检测方法

单鹏畅, 高利剑, 董文龙, 毛启容*

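  • Supported by:
    Key Research and Development Program of Jiangsu Province (BE2020036); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_3675); Special Scientific Research Project of the School of Emergency Management, Jiangsu University (KY-A-01)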

Abstract:

Action detection comprises action classification and boundary localization, and existing work focuses predominantly on action features and boundary features. Current methods generally neglect the importance of spatial features for this task and predict action boundaries ambiguously, which limits the performance and practical applicability of action detection models. To address these problems, this paper proposes a Salient Object Tracking-based Action Detection (SOT-AD) method. First, to learn salient spatial information at different scales, a hierarchical attention network is proposed to capture the salient objects associated with actions while reducing interference from action-irrelevant information. Second, to keep the salient objects attended to at adjacent temporal positions consistent, a salient object tracking loss is proposed. Finally, neutral samples are introduced to help construct a "target-sub-target-background" feature pool, which is used to learn the temporal context of the feature sequence and thereby realize salient object tracking. Experimental results on two widely used datasets, THUMOS14 and ActivityNet1.3, show that SOT-AD improves on mainstream methods by an average of 0.9 and 0.6 percentage points in mean Average Precision (mAP), respectively. On THUMOS14, SOT-AD achieves an mAP@0.5 of 72.7%.
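The abstract does not give the form of the salient object tracking loss. As a rough illustration of the general noise-contrastive idea named in the key words, the sketch below uses a standard InfoNCE-style loss: the feature at one temporal position is pulled toward the feature at the adjacent position and pushed away from the other entries of a feature pool. The function names, the cosine similarity, the temperature value, the toy vectors, and the treatment of sub-target and background entries as plain negatives are all assumptions made for illustration, not the formulation used in SOT-AD.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: negative log-softmax score of the positive
    among the positive and all negatives (hypothetical helper)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]


# Toy features (made up): the salient target at positions t and t+1,
# plus a pool holding a sub-target entry and a background entry.
target_t = [1.0, 0.1, 0.0]
target_next = [0.9, 0.2, 0.0]
sub_target = [0.5, 0.5, 0.1]
background = [0.0, 0.1, 1.0]

# Attention stays on the same object across adjacent positions.
consistent = info_nce(target_t, target_next, [sub_target, background])
# Attention drifts to the background at the next position.
drifted = info_nce(target_t, background, [target_next, sub_target])
print(consistent, drifted)
```

In this toy setup the loss is small when adjacent positions attend to the same object and large when attention drifts to the background, which is the kind of temporal consistency the tracking loss is described as encouraging.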

Key words: action detection (AD), attention mechanism, noise-contrastive loss, action tracking, feature pyramid
