Abstract
To address the low accuracy of person re-identification (Re-ID) algorithms caused by occlusion, weakly discriminative person features, and unclear detail features in complex environments, we propose a Re-ID method based on fine-grained feature fusion and a self-attention mechanism. First, we design a dilated non-local module (DNLM), which combines dilated convolution with the non-local module and is embedded between layers of the backbone network; it strengthens the model's self-attention, enlarges its receptive field, and improves performance on occlusion tasks. Second, we build a fine-grained feature fusion screening module (3FSM) as an improvement on the outlook attention module; it performs adaptive feature selection and enhances the model's ability to distinguish similar samples. Finally, drawing on the feature pyramid used in object detection, we propose a multi-scale feature fusion pyramid (MFFP), which uses features from different levels to perform feature enhancement for Re-ID tasks. Ablation and comprehensive experiments on multiple datasets validate the effectiveness of our proposal. On Market1501 and DukeMTMC-reID, the method achieves a mean Average Precision (mAP) of 92.5% and 87.7% and a Rank-1 accuracy of 95.1% and 91.1%, respectively. Compared with current mainstream Re-ID algorithms, our method achieves strong Re-ID performance.
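For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank-1 and mAP are commonly computed from a query-to-gallery distance matrix. It is not the authors' evaluation code: the function names, the toy distances, and the identity labels are hypothetical, and the sketch omits the same-camera filtering that standard Re-ID benchmark protocols normally apply.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query: mean of the precision values at each true-match rank."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    # A query with no true match in the gallery contributes an AP of 0.
    return precision_sum / hits if hits else 0.0


def evaluate(distances: List[List[float]],
             query_ids: List[int],
             gallery_ids: List[int]) -> Tuple[float, float]:
    """Return (Rank-1, mAP) as fractions in [0, 1]."""
    rank1_hits = 0
    ap_sum = 0.0
    for q, row in enumerate(distances):
        # Rank gallery images from the smallest to the largest distance.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        rank1_hits += int(matches[0])
        ap_sum += average_precision(matches)
    n = len(distances)
    return rank1_hits / n, ap_sum / n


# Toy data (hypothetical): 2 queries, 4 gallery images.
toy_distances = [
    [0.1, 0.9, 0.3, 0.8],  # query 0, identity 7
    [0.2, 0.6, 0.7, 0.4],  # query 1, identity 9
]
toy_query_ids = [7, 9]
toy_gallery_ids = [7, 9, 7, 9]

rank1, mean_ap = evaluate(toy_distances, toy_query_ids, toy_gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
# prints "Rank-1 = 0.500, mAP = 0.792"
```

Rank-1 only checks whether the nearest gallery image shares the query identity, whereas mAP rewards placing every true match near the top of the ranking, which is why the two numbers can differ noticeably on the same data.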
Data availability
The datasets analysed during the current study are available from public resources. Market-1501 dataset: https://drive.google.com/file/d/0B8-rUzbwVRk0c054eEozWG9COHM/view. DukeMTMC-reID dataset: https://drive.google.com/file/d/1jjE85dRCMOgRtvJ5RQV9-Afs-2_5dY3O/view.
Acknowledgements
This work was sponsored by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No.2022D01B187 and No.2022D01B05) and Shenzhen Science and Technology Program (No.JSGG20220301090405009).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yin, K., Ding, Z., Dong, Z. et al. Person re-identification method based on fine-grained feature fusion and self-attention mechanism. Computing (2024). https://doi.org/10.1007/s00607-024-01270-5
DOI: https://doi.org/10.1007/s00607-024-01270-5