Abstract
Person re-identification (ReID) aims to match a specific person across non-overlapping camera views and has wide application prospects. However, existing methods remain susceptible to occlusion and to missing critical body parts. Most methods fuse low-level detail features with high-level semantic features by concatenation or addition, so useful information is overwhelmed by a large amount of irrelevant information. In addition, many methods design dedicated blocks to extract spatial context features but ignore local channel context features. To alleviate these issues, this paper presents an improved interaction-and-aggregation network (IIANet) that learns more representative features. First, to improve robustness to severe occlusion or missing crucial parts of the target person, we employ a global multi-scale module (MSM) that extracts multi-scale features through multi-branch convolution and hierarchical residual connections. Second, to fuse low-level detail features and high-level semantic features selectively, we design a gated fully fusion module (GFFM) that controls information transmission and reduces interference when fusing features from different levels. Finally, we adopt a channel context module (CCM) to learn channel context information via multi-scale local fusion. Extensive experiments on the Market-1501 dataset demonstrate the improved performance of IIANet: the mAP and Rank-1 accuracy of our model reach 84.9% and 94.2%, respectively. Our code is available at: https://gitee.com/bingsfan/iianet/tree/master/
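For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how mAP and Rank-1 accuracy are commonly computed for a retrieval task such as ReID. It is not taken from the authors' code: the function names and the toy data are hypothetical, and the sketch assumes each query comes with a gallery already sorted by descending similarity. It also omits the filtering of same-camera and junk gallery images that standard Market-1501 evaluation applies.

```python
def average_precision(ranked_ids, query_id):
    """Average precision for one query.

    ranked_ids: gallery identity labels, most similar first (assumed input).
    Returns the mean of precision@k over every rank k holding a correct match.
    """
    hits = 0
    precisions = []
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / rank)
    # A query with no correct match in the gallery contributes 0 here.
    return sum(precisions) / len(precisions) if precisions else 0.0


def rank1_hit(ranked_ids, query_id):
    """1.0 if the top-ranked gallery item shares the query identity, else 0.0."""
    return 1.0 if ranked_ids and ranked_ids[0] == query_id else 0.0


def evaluate(queries):
    """queries: list of (query_id, ranked_ids) pairs. Returns (mAP, Rank-1)."""
    n = len(queries)
    mean_ap = sum(average_precision(r, q) for q, r in queries) / n
    rank1 = sum(rank1_hit(r, q) for q, r in queries) / n
    return mean_ap, rank1


# Hypothetical toy rankings, for illustration only.
toy_queries = [
    ("a", ["a", "b", "a", "c"]),  # AP = (1/1 + 2/3) / 2
    ("b", ["c", "b", "a", "b"]),  # AP = (1/2 + 2/4) / 2
]
mean_ap, rank1 = evaluate(toy_queries)
print(f"mAP = {mean_ap:.4f}, Rank-1 = {rank1:.4f}")
```

Rank-1 only checks the single best match, whereas mAP rewards placing every correct gallery image near the top, which is why the two figures can differ substantially for the same model.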
Data availability
The original datasets have been published online. The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was partly supported by the National Natural Science Foundation of China (No. 62102320) and the Fundamental Research Funds for the Central Universities (No. D5000210737).
Ethics declarations
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Tao, H., Bao, W., Duan, Q. et al. An improved interaction-and-aggregation network for person re-identification. Multimed Tools Appl 82, 44053–44069 (2023). https://doi.org/10.1007/s11042-023-15531-6
DOI: https://doi.org/10.1007/s11042-023-15531-6