Channel and Spatial Dual-Attention Network for Person Re-Identification

doi:10.19678/j.issn.1000-3428.0063136

Abstract

To address the difficulty of obtaining discriminative pedestrian features in real-world scenes, where camera viewpoints and pedestrian poses change, objects occlude the subject, image resolution is low, and pedestrian images are misaligned, a Hybrid Pooling Channel Attention Module (HPCAM) and a Full Pixel Spatial Attention Module (FPSAM) are designed. Based on these two attention modules, a Channel and Spatial Dual-Attention Network (CSDA-Net) is proposed. HPCAM suppresses the interference of uninformative information and enhances the expression of salient features in the channel dimension, yielding highly discriminative pedestrian features. FPSAM enhances the discriminative ability of pedestrian features in the spatial dimension, thereby improving the accuracy of person Re-Identification (ReID). By integrating HPCAM and FPSAM in stages into a conventional deep person ReID framework, attention features ranging from coarse to fine-grained are obtained. Experimental results show that CSDA-Net achieves Rank-1 accuracies of 78.3%, 91.3%, and 96.0% on the mainstream person ReID datasets CUHK03, DukeMTMC-ReID, and Market1501, respectively, with mean Average Precision (mAP) values of 80.0%, 82.1%, and 90.4%. Compared with MGN, CSDA-Net improves Rank-1 accuracy on the three datasets by 14.9, 2.6, and 0.3 percentage points and mAP by 13.7, 3.7, and 3.5 percentage points, respectively. This indicates that CSDA-Net can extract more robust and discriminative feature representations.
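The abstract does not describe the internal layers of HPCAM or FPSAM, so the following is only a toy, parameter-free sketch of the general idea: a per-channel gate followed by a per-pixel gate. Every name here (`channel_gate`, `spatial_gate`, `dual_attention`) is hypothetical. Two further assumptions are that "hybrid pooling" means combining average and max pooling, and that the two pooled values are summed. The actual modules presumably contain learned layers that this sketch omits.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feat):
    """Scale each channel by one weight derived from its pooled responses.

    feat is a list of C channels, each an H x W list of lists.
    Summing the average- and max-pooled values is an assumption.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        avg_pool = sum(values) / len(values)
        max_pool = max(values)
        gate = sigmoid(avg_pool + max_pool)  # one weight in (0, 1) per channel
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_gate(feat):
    """Scale every pixel by a weight computed from the channel mean at that pixel."""
    n_ch = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch) for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gates[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_ch)
    ]


def dual_attention(feat):
    # Channel gating first, then spatial gating (the ordering is an assumption).
    return spatial_gate(channel_gate(feat))


# A 2-channel, 2x2 feature map: one strongly positive and one negative channel.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[-1.0, -2.0], [-3.0, -4.0]],
]
refined = dual_attention(features)
```

In this toy input the negative channel receives a much smaller channel weight than the positive one, which loosely mirrors the abstract's description of suppressing uninformative channels while keeping salient ones.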

Key words: person Re-Identification (ReID), dual-attention mechanism, pedestrian features, deep learning, mean Average Precision (mAP)
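For readers unfamiliar with the two reported metrics, the sketch below shows how Rank-1 accuracy and mAP are commonly computed in retrieval-style evaluation. It is a simplified illustration, not the evaluation protocol of the paper: the function names are hypothetical, and the same-camera filtering that ReID benchmarks usually apply is omitted.

```python
def rank1_and_ap(ranked_matches):
    """Return (rank-1 hit, average precision) for one query.

    ranked_matches lists the gallery in ascending distance to the query;
    True marks a gallery image with the same identity as the query.
    """
    hits = 0
    precisions = []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)  # precision at each correct match
    if not precisions:
        raise ValueError("query has no true match in the gallery")
    return float(ranked_matches[0]), sum(precisions) / len(precisions)


def evaluate(all_queries):
    """Average the per-query results into Rank-1 accuracy and mAP."""
    results = [rank1_and_ap(q) for q in all_queries]
    rank1 = sum(r for r, _ in results) / len(results)
    mean_ap = sum(ap for _, ap in results) / len(results)
    return rank1, mean_ap


queries = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2
    [False, True, False, False],  # AP = 1/2
]
rank1, mean_ap = evaluate(queries)
```

Rank-1 only checks whether the top-ranked gallery image is correct, while mAP rewards ranking all correct matches early, which is why the two numbers can differ noticeably on the same dataset.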

CLC Number: TP391

ZENG Tao, XUE Feng, YANG Tian. Channel and Spatial Dual-Attention Network for Person Re-Identification[J]. Computer Engineering, 2022, 48(12): 281-287,295.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0063136

http://www.ecice06.com/EN/Y2022/V48/I12/281
