Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 302-309. doi: 10.19678/j.issn.1000-3428.0065333

• Development Research and Engineering Application •

Cross-Modality Person Re-identification Using Four-Stream Network Based on Dual-Intermediate Modalities

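Hua HAN, Li HUANG, Jin TIAN, Chunyuan WANG

  1. Shanghai Data Intelligence Technology and Application Collaborative Innovation Center, School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2022-07-25 Publication date: 2023-08-15 Release date: 2022-11-14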
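  • About the authors:

    Hua HAN (b. 1983), female, professor, Ph.D.; main research interests: person re-identification and pattern recognition

    Li HUANG, lecturer, Ph.D.

    Jin TIAN, associate professor, Ph.D.

    Chunyuan WANG, associate professor, Ph.D.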

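  • Supported by:
    National Natural Science Foundation of China (62103257); National Natural Science Foundation of China (61305014); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0109300); Natural Science Foundation of Shanghai (22ZR1426200); "Chenguang Program" of the Shanghai Municipal Education Commission and the Shanghai Education Development Foundation (13CG60)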

Abstract:

Most cameras support both infrared and visible-light imaging, so practical re-identification methods must handle cross-modality person re-identification. To narrow the gap between the infrared and visible-light modalities and improve recognition accuracy, a four-stream cross-modality person re-identification method based on dual-intermediate modalities is proposed. Two lightweight networks generate an intermediate-modality image for the visible-light modality and for the infrared modality, respectively, and these images inherit their labels from the original visible-light and infrared images. The ResNet50 backbone is split and rebuilt into a network suited to learning features shared across the four modalities. Parameter sharing in the four-stream backbone is also examined, and the effect of the number of blocks shared by the four modalities on cross-modality person re-identification is analyzed. Experimental results show that, compared with HcTri, the proposed method improves Rank-1 and mAP on the SYSU-MM01 dataset by 2.38 and 4.64 percentage points, respectively, in global search mode and by 6.24 and 6.77 percentage points, respectively, in indoor search mode. On the RegDB dataset, it improves Rank-1, mAP, and mINP by 2.52, 3.74, and 4.68 percentage points, respectively, in visible-light-to-infrared search mode and by 2.70, 3.47, and 5.56 percentage points, respectively, in infrared-to-visible-light search mode.
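The abstract reports gains in Rank-1, mAP, and mINP without defining them. As a rough illustration only, the sketch below computes these three metrics with the definitions commonly used in the re-identification literature; this is an assumption, since the abstract does not state the exact evaluation code. The function names `query_metrics` and `mean_metrics` and the toy ranking lists are hypothetical, and the official SYSU-MM01 and RegDB protocols may apply additional rules that are not modeled here.

```python
from typing import List, Sequence, Tuple


def query_metrics(ranked_matches: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (rank1, average_precision, inp) for a single query.

    ranked_matches[i] is True when the gallery image at rank i + 1
    (sorted by decreasing similarity) has the same identity as the query.
    """
    num_correct = sum(ranked_matches)
    if num_correct == 0:
        raise ValueError("query has no correct match in the gallery")

    # Rank-1: is the top-ranked gallery image a correct match?
    rank1 = 1.0 if ranked_matches[0] else 0.0

    hits = 0
    precision_sum = 0.0
    hardest_rank = 0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this correct match
            hardest_rank = rank           # ends as the rank of the last correct match

    # Average precision: mean of the precision values at each correct match.
    average_precision = precision_sum / num_correct
    # Inverse negative penalty: correct matches divided by the rank of the hardest one.
    inp = num_correct / hardest_rank
    return rank1, average_precision, inp


def mean_metrics(all_queries: List[Sequence[bool]]) -> Tuple[float, float, float]:
    """Average the per-query metrics and express them in percent."""
    per_query = [query_metrics(q) for q in all_queries]
    n = len(per_query)
    rank1 = 100.0 * sum(m[0] for m in per_query) / n
    m_ap = 100.0 * sum(m[1] for m in per_query) / n
    m_inp = 100.0 * sum(m[2] for m in per_query) / n
    return rank1, m_ap, m_inp


if __name__ == "__main__":
    # Two toy queries, each with a ranked gallery of four images.
    toy = [
        [True, False, True, False],
        [False, True, False, False],
    ]
    print(mean_metrics(toy))
```

Because all three metrics are percentages, a difference between two methods is a difference in percentage points, which is how the improvements over HcTri are stated.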

Key words: person re-identification, dual-intermediate modalities, four-stream backbone network, cross-modality re-identification, parameter sharing