Computer Engineering ›› 2019, Vol. 45 ›› Issue (12): 257-262. doi: 10.19678/j.issn.1000-3428.0053184

• Multimedia Technology and Application •

基于行为主体检测的视频行为快速检测

张杰豪, 陈华杰, 姚勤炜, 侯新雨   

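  1. School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
  • Received: 2018-11-20 Revised: 2018-12-27 Published: 2019-03-07
  • About the authors: ZHANG Jiehao (born 1994), male, master's student, whose main research interests are video detection, pattern recognition, and machine learning; CHEN Huajie, professor, Ph.D.; YAO Qinwei and HOU Xinyu, master's students.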

Fast Video Action Detection Based on Action Subject Detection

ZHANG Jiehao, CHEN Huajie, YAO Qinwei, HOU Xinyu   

  1. School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou 310018, China
  • Received: 2018-11-20 Revised: 2018-12-27 Published: 2019-03-07

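Abstract: Existing video action detection methods generate candidate regions with a sliding window, which is slow on long videos. To address this problem, a fast detection method is proposed that localizes static action subjects. A long video is split into several video units, and the Fast R-CNN algorithm is applied to the first frame of each unit to detect action subjects. For units in which an action subject is detected, temporal regions are delimited to generate candidate regions where actions may occur, which reduces the input data of the action detection network. On this basis, a 3D convolutional neural network classifies the candidate regions, and boundary regression is applied to the regions classified as actions to obtain accurate temporal localization of each action. Experimental results show that the method improves detection speed by more than 2 times over the TURN method, while its mAP drops by only 0.7%.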

Keywords: action detection, action subject detection, boundary regression, 3D convolutional neural network, video unit

Abstract: Existing video action detection methods use a sliding-window operation to generate candidate regions, which makes them slow on long videos. To address this problem, a fast detection method is proposed that detects static action subjects. First, a long video is divided into several units, and the Fast R-CNN algorithm is used to detect the action subject in the first frame of each unit. Then, temporal regions are defined around the units that contain an action subject to generate candidate regions where an action may occur, which reduces the input data of the action detection network. On this basis, a 3D Convolutional Neural Network (CNN) classifies the candidate regions. Finally, boundary regression is performed on the action regions to obtain accurate localization of the action on the time axis. Experimental results show that the proposed method improves detection speed by more than 2 times compared with the TURN method, while its mAP decreases by only 0.7%.

Key words: action detection, action subject detection, boundary regression, 3D Convolutional Neural Network(CNN), video unit
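
The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `candidate_regions`, the `has_subject` callback (a stand-in for the Fast R-CNN subject detector), and the rule of merging consecutive subject-bearing units into one region are all assumptions made for illustration, since the abstract does not specify how the temporal regions are delimited.

```python
from typing import Callable, List, Tuple


def candidate_regions(
    num_frames: int,
    unit_len: int,
    has_subject: Callable[[int], bool],
) -> List[Tuple[int, int]]:
    """Return candidate regions as half-open frame ranges [start, end).

    `has_subject(frame_index)` is a hypothetical stand-in for a per-frame
    subject detector; it is called only on the first frame of each unit.
    """
    # Number of units, counting a shorter final unit if frames are left over.
    num_units = (num_frames + unit_len - 1) // unit_len

    # One detector call per unit, on that unit's first frame.
    flags = [has_subject(u * unit_len) for u in range(num_units)]

    # Assumption: consecutive units that contain a subject form one region.
    regions: List[Tuple[int, int]] = []
    start = None
    for u, flag in enumerate(flags):
        if flag and start is None:
            start = u
        elif not flag and start is not None:
            regions.append((start * unit_len, u * unit_len))
            start = None
    if start is not None:
        # The last region runs to the end of the video.
        regions.append((start * unit_len, num_frames))
    return regions


if __name__ == "__main__":
    # Toy detector: a subject is visible in the frames listed here.
    frames_with_subject = {16, 32, 80}
    print(candidate_regions(100, 16, lambda f: f in frames_with_subject))
```

In this toy setting the detector runs on 7 frames (one per 16-frame unit) rather than on every window of a 100-frame video, and only the returned regions would be passed on to a 3D CNN classifier and boundary regression stage.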

CLC number: