
Computer Engineering ›› 2022, Vol. 48 ›› Issue (10): 313-320. doi: 10.19678/j.issn.1000-3428.0062643

• Development Research and Engineering Application •

Unsupervised Video Person Re-identification with Complementary Temporal Features

WANG Fuyin, HAN Hua, HUANG Li, CHEN Yiping   

  1. College of Electronic & Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2021-09-10; Revised: 2021-10-31; Published: 2021-11-15

  • About the authors: WANG Fuyin (born 1991), male, M.S.; his research interests include person re-identification, machine learning, and deep learning. HAN Hua (corresponding author) is a professor. HUANG Li and CHEN Yiping are lecturers with Ph.D. degrees.
  • Funding:
    National Natural Science Foundation of China (61305014); "Chenguang Program" of the Shanghai Municipal Education Commission and Shanghai Education Development Foundation (13CG60).

Abstract: Current video person re-identification methods cannot effectively extract the spatiotemporal information between video frames, and they also depend on costly manual labeling. Therefore, this article proposes an unsupervised video person re-identification method with complementary temporal features. A Temporal Feature Erasure (TFE) network module extracts temporal and spatial information between video frames while mining different pedestrian features and reducing the redundancy of per-frame features, thus obtaining complete features of the target pedestrian from different views. A constrained unsupervised hierarchical clustering module obtains high-quality clusters of distinct identities by computing the distances between samples; clustering according to inter-cluster distances then generates high-quality pseudo-labels, which improves the discrimination of extremely similar samples with different identities. A PK-sampling hard-sample triplet loss module draws samples from the clustered results to build a new training set after each clustering iteration, reducing the impact of hard samples on the model. The method is evaluated experimentally on the Motion Analysis and Re-identification Set (MARS) and Duke Multi-Target Multi-Camera Video-based Re-IDentification (DukeMTMC-VideoReID) datasets, where its mean Average Precision (mAP) reaches 46.4% and 72.5% and its Rank-1 accuracy reaches 69.3% and 80.5%, respectively. These performance indices are superior to those of traditional methods such as RACM and DAL.
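The PK-sampling hard-sample triplet loss mentioned in the abstract follows the common batch-hard formulation: each mini-batch is built from P pseudo-label clusters with K tracklets each, and every anchor is contrasted with its hardest positive and hardest negative within the batch. The following is a minimal PyTorch sketch of that idea, not the authors' implementation; the names pk_sample and batch_hard_triplet_loss, and the P, K, and margin values, are illustrative assumptions.

    import random

    import torch
    import torch.nn.functional as F


    def pk_sample(cluster_to_indices, P=8, K=4):
        """Draw a P x K batch: P pseudo-label clusters, K tracklet indices per cluster."""
        clusters = random.sample(list(cluster_to_indices), P)
        batch = []
        for c in clusters:
            idxs = cluster_to_indices[c]
            # Sample with replacement so small clusters can still supply K tracklets.
            batch.extend(random.choices(idxs, k=K))
        return batch


    def batch_hard_triplet_loss(features, pseudo_labels, margin=0.3):
        """Batch-hard triplet loss over a (P*K, D) batch of tracklet embeddings.

        pseudo_labels holds the cluster IDs produced by the clustering step.
        """
        # Pairwise Euclidean distances between all embeddings in the batch.
        dist = torch.cdist(features, features, p=2)

        same = pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)
        eye = torch.eye(len(features), dtype=torch.bool, device=features.device)
        pos_mask = same & ~eye   # same cluster, excluding the anchor itself
        neg_mask = ~same         # different cluster

        # Hardest positive: farthest sample sharing the anchor's pseudo-label.
        hardest_pos = dist.masked_fill(~pos_mask, 0.0).max(dim=1).values
        # Hardest negative: closest sample with a different pseudo-label.
        hardest_neg = dist.masked_fill(~neg_mask, float("inf")).min(dim=1).values

        return F.relu(hardest_pos - hardest_neg + margin).mean()


    if __name__ == "__main__":
        # Toy check with random embeddings: 4 pseudo-label clusters x 4 samples, 128-D features.
        labels = torch.arange(4).repeat_interleave(4)
        feats = torch.randn(16, 128)
        print(batch_hard_triplet_loss(feats, labels).item())

In this reading, pk_sample would be applied after each clustering iteration to assemble the new training set described in the abstract, so that every batch contains enough positives and negatives for the batch-hard mining to be meaningful.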

Key words: unsupervised video person re-identification, dispersion, clustering, diversity constraint, temporal feature


