An improved interaction-and-aggregation network for person re-identification

Multimedia Tools and Applications

Abstract

Person re-identification (ReID) aims to match a specific person across non-overlapping camera views and has broad application prospects. However, existing methods remain susceptible to occlusion and to missing critical body parts. Most methods fuse low-level detail features with high-level semantic features by concatenation or addition, so useful information can be overwhelmed by a large amount of irrelevant information. In addition, many methods design dedicated blocks to extract spatial context features but ignore local channel context features. To alleviate these issues, this paper presents an improved interaction-and-aggregation network (IIANet) that learns more representative features. First, to improve robustness to severe occlusion or missing crucial parts of the target person, we employ a global multi-scale module (MSM) that extracts multi-scale features through multi-branch convolution and hierarchical residual connections. Second, to selectively fuse low-level detail features and high-level semantic features, we design a gated fully fusion module (GFFM) that controls information transmission and reduces interference when fusing features from different levels. Finally, we adopt a channel context module (CCM) to learn channel context information via multi-scale local fusion. Extensive experiments on the Market-1501 dataset demonstrate the improved performance of IIANet: the model reaches 84.9% mAP and 94.2% Rank-1 accuracy. Our code is available at: https://gitee.com/bingsfan/iianet/tree/master/
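
To make the idea of gated fusion more concrete, the following is a minimal pure-Python sketch of one common gating scheme, not the GFFM implementation from the paper. It assumes that all feature levels have already been brought to the same spatial size and channel count, that each level computes a per-position gate as the sigmoid of a linear projection over channels (the role a 1x1 convolution would play), and that each level keeps its own features where its gate is high and accepts gated features from the other levels where its gate is low. The function names (`gate_map`, `gated_fusion`) and the `gate_weights` parameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate_map(feature, weights, bias=0.0):
    """Per-position gate in (0, 1): sigmoid of a linear projection over channels."""
    return [sigmoid(sum(w * c for w, c in zip(weights, pos)) + bias)
            for pos in feature]


def gated_fusion(features, gate_weights):
    """Fuse same-shaped feature maps from several levels (assumed scheme).

    features:     list of L maps; each map is a list of positions,
                  each position a list of C channel values.
    gate_weights: list of L weight vectors of length C (hypothetical parameters).
    Returns one fused map per level.
    """
    gates = [gate_map(f, w) for f, w in zip(features, gate_weights)]
    n_levels = len(features)
    fused = []
    for l in range(n_levels):
        level_out = []
        for p, pos in enumerate(features[l]):
            g_l = gates[l][p]
            out = []
            for c, value in enumerate(pos):
                # What the other levels offer at this position, each scaled by its own gate.
                incoming = sum(gates[i][p] * features[i][p][c]
                               for i in range(n_levels) if i != l)
                # High g_l: keep own feature. Low g_l: accept more from other levels.
                out.append((1.0 + g_l) * value + (1.0 - g_l) * incoming)
            level_out.append(out)
        fused.append(level_out)
    return fused


# Toy input: two levels, two positions, two channels.
low = [[1.0, 0.0], [0.5, 0.5]]
high = [[0.0, 2.0], [1.0, -1.0]]

# Zero gate weights give a gate of 0.5 at every position.
fused = gated_fusion([low, high], [[0.0, 0.0], [0.0, 0.0]])

# A strongly positive gate on the low level makes it ignore the high level.
confident = gated_fusion([low, high], [[50.0, 50.0], [0.0, 0.0]])
```

In contrast to plain addition or concatenation, the gates let each position decide how much cross-level information to admit, which is the kind of interference reduction the abstract describes. In a trained network the gate weights would be learned; here they are fixed by hand only to show the behaviour.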

Data availability

The original datasets have been published online. The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (No. 62102320) and the Fundamental Research Funds for the Central Universities (No. D5000210737).

Author information

Corresponding author

Correspondence to Huanjie Tao.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tao, H., Bao, W., Duan, Q. et al. An improved interaction-and-aggregation network for person re-identification. Multimed Tools Appl 82, 44053–44069 (2023). https://doi.org/10.1007/s11042-023-15531-6

  • DOI: https://doi.org/10.1007/s11042-023-15531-6
