Person re-identification method based on fine-grained feature fusion and self-attention mechanism

  • Regular Paper
  • Computing

Abstract

To address the low accuracy of person re-identification (Re-ID) algorithms caused by occlusion, weakly discriminative person features, and unclear detail features in complex environments, we propose a Re-ID method based on fine-grained feature fusion and a self-attention mechanism. First, we design a dilated non-local module (DNLM), which combines dilated convolution with the non-local module and is embedded between layers of the backbone network; it strengthens the model's self-attention, enlarges its receptive field, and improves performance on occlusion tasks. Second, we build a fine-grained feature fusion screening module (3FSM) as an improvement on the outlook attention module; it performs adaptive feature selection and improves the model's ability to distinguish similar samples. Finally, drawing on the feature pyramid from object detection, we propose a multi-scale feature fusion pyramid (MFFP) that enhances features for Re-ID by using features from different levels. Ablation and comprehensive experiments on multiple datasets validate the effectiveness of our proposal. On Market-1501 and DukeMTMC-reID, the method achieves a mean Average Precision (mAP) of 92.5% and 87.7% and a Rank-1 accuracy of 95.1% and 91.1%, respectively. Compared with current mainstream Re-ID algorithms, our method achieves excellent Re-ID performance.
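For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank-1 and mAP are commonly computed from a query-by-gallery distance matrix. It is not the evaluation code of this paper. The function names `average_precision` and `evaluate` and the toy data are hypothetical, and the sketch assumes a simplified protocol: it does not filter same-camera or junk gallery images as the standard benchmark protocols do, and a query with no match in the gallery contributes an AP of 0.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query: mean of the precision at each rank holding a true match."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    # Simplifying assumption: a query without any match gets AP = 0.
    return precision_sum / hits if hits else 0.0


def evaluate(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix."""
    rank1_hits = 0
    ap_sum = 0.0
    for q, row in enumerate(dist):
        # Sort gallery indices by ascending distance to the query.
        order: List[int] = sorted(range(len(row)), key=row.__getitem__)
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        rank1_hits += int(matches[0])  # is the closest gallery image correct?
        ap_sum += average_precision(matches)
    n = len(dist)
    return rank1_hits / n, ap_sum / n


if __name__ == "__main__":
    # Toy example: 2 queries, 3 gallery images (identities 1, 2, 1).
    toy_dist = [[0.1, 0.5, 0.9], [0.8, 0.2, 0.3]]
    rank1, mean_ap = evaluate(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
    print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

Rank-1 only checks the single nearest gallery image, whereas mAP rewards ranking every image of the same identity near the top, which is why the two numbers are reported together.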

Data availability

The datasets analysed during the current study are available from the following public resources. Market-1501 dataset: https://drive.google.com/file/d/0B8-rUzbwVRk0c054eEozWG9COHM/view. DukeMTMC-reID dataset: https://drive.google.com/file/d/1jjE85dRCMOgRtvJ5RQV9-Afs-2_5dY3O/view.

Acknowledgements

This work was sponsored by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2022D01B187 and No. 2022D01B05) and the Shenzhen Science and Technology Program (No. JSGG20220301090405009).

Author information

Corresponding authors

Correspondence to Guangqiang Yin or Zhiguo Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yin, K., Ding, Z., Dong, Z. et al. Person re-identification method based on fine-grained feature fusion and self-attention mechanism. Computing (2024). https://doi.org/10.1007/s00607-024-01270-5

  • DOI: https://doi.org/10.1007/s00607-024-01270-5
