Abstract
Occluded person re-identification (Re-ID) is a long-standing challenge because occlusions inevitably cause a loss of pedestrian information. Most existing methods tackle the challenge by employing auxiliary models, such as pose estimation or graph matching models, to learn multi-scale or part-level features. However, because these methods rely heavily on external cues, their performance degrades when the target pedestrian is severely occluded or occluded by another pedestrian. This paper develops a novel Re-ID model, single-scale robust feature representation (SRFR), that learns discriminative single-scale features without external cues. Specifically, a lightweight spatial memory module is introduced that takes advantage of a key-value memory network to store occlusion features and uses a self-attention architecture to extract fine-grained features. Furthermore, a camera-constrained triplet loss (CTL) function, built on the triplet loss, is used to mitigate the negative effects of different pedestrian samples captured by the same camera. Experimental results show that SRFR achieves superior performance on both occluded and holistic datasets, which indicates that single-scale features can also be effective for mining discriminative features.
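To make the idea of a camera constraint on the triplet loss more concrete, the following is a minimal sketch and not the CTL formulation of the paper, which the abstract does not define. It assumes a batch-hard triplet loss in which, for each anchor, the hardest negative is mined from different-identity samples seen by the same camera whenever such samples exist, and from all negatives otherwise. The function name, the margin value, and this mining rule are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def camera_constrained_triplet_loss(
    feats: List[Sequence[float]],
    pids: List[int],
    camids: List[int],
    margin: float = 0.3,
) -> float:
    """Batch-hard triplet loss with a hypothetical camera constraint.

    ASSUMPTION (not taken from the paper): negatives are mined from
    same-camera, different-identity samples when any exist, and from
    all different-identity samples otherwise.
    """
    losses = []
    n = len(feats)
    for i in range(n):
        # Distances to samples sharing the anchor's identity (positives).
        pos = [euclidean(feats[i], feats[j]) for j in range(n)
               if j != i and pids[j] == pids[i]]
        # Distances to every sample with a different identity (negatives).
        neg_all = [euclidean(feats[i], feats[j]) for j in range(n)
                   if pids[j] != pids[i]]
        # Negatives that were also captured by the anchor's camera.
        neg_same_cam = [euclidean(feats[i], feats[j]) for j in range(n)
                        if pids[j] != pids[i] and camids[j] == camids[i]]
        if not pos or not neg_all:
            continue  # no valid triplet can be formed for this anchor
        negs = neg_same_cam or neg_all
        # Hardest positive (farthest) against hardest negative (closest).
        losses.append(max(0.0, margin + max(pos) - min(negs)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: 1-D features, two identities, two cameras.
loss = camera_constrained_triplet_loss(
    feats=[[0.0], [1.0], [0.5]],
    pids=[1, 1, 2],
    camids=[0, 1, 0],
)
```

Other readings of the constraint are equally plausible, for example down-weighting same-camera negatives instead of preferring them; the sketch only shows where camera labels could enter a standard triplet loss.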
Availability of data and material
The authors confirm that all data and materials support the published claims and comply with field standards.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
This research was funded by the Project of National Natural Science Foundation of China under Grant No. 62106023.
About this article
Cite this article
Song, Y., Liu, S., Sun, Z. et al. Single-scale robust feature representation for occluded person re-identification. Neural Comput & Applic 35, 22551–22562 (2023). https://doi.org/10.1007/s00521-023-08770-z
DOI: https://doi.org/10.1007/s00521-023-08770-z