Unsupervised domain adaption for image-to-video person re-identification

Zhang, Xinyu; Li, Sen; Jing, Xiao-Yuan; Ma, Fei; Zhu, Chen

doi:10.1007/s11042-019-08550-9

Published: 16 January 2020

Volume 79, pages 33793–33810, (2020)

Multimedia Tools and Applications

Xinyu Zhang ORCID: orcid.org/0000-0002-9109-1889¹,
Sen Li¹,
Xiao-Yuan Jing¹,²,³,
Fei Ma¹ &
…
Chen Zhu¹

Abstract

Person re-identification has recently been applied successfully in many fields, such as suspect tracking and locating missing persons. Because video generally carries more useful information than a single image, a growing number of researchers focus on video-based person re-identification, and in particular on image-to-video person re-identification (IVPR). However, most existing IVPR models are supervised. Labeling enough training samples requires substantial manual effort, which limits the practical value of these models. At the same time, the 2D features extracted from pedestrian images and the 3D features extracted from pedestrian videos are heterogeneous, which poses a significant challenge for the IVPR task. To address these problems effectively, we propose an unsupervised domain adaption model for image-to-video person re-identification, based on a cross-modal feature generating and target information preserving transfer network (CMGTN). On the one hand, the generator in our model not only transforms the features of unlabeled target-domain samples into the source-domain feature space, but also preserves target identity information. On the other hand, we eliminate the gap between pedestrian images and videos by embedding a cross-modal loss term. To evaluate our approach, we conduct extensive experiments on the PRID-2011, iLIDS-VID and MARS datasets, and compare it with existing state-of-the-art IVPR models, including four unsupervised methods and three supervised methods. Experimental results demonstrate the effectiveness of our approach.

Acknowledgments

The authors would like to thank the editor, the associate editor, and the anonymous reviewers for their constructive comments, which helped improve this work. This work was supported by the NSFC-Key Project under Grant No. 61933013, the NSFC-Key Project of General Technology Fundamental Research United Fund under Grant No. U1736211, the Key Project of Natural Science Foundation of Hubei Province under Grant No. 2018CFA024, and the Natural Science Foundation of Guangdong Province under Grant No. 2019A1515011076.

Author information

Authors and Affiliations

School of Computer Science, Wuhan University, Wuhan, 430072, China
Xinyu Zhang, Sen Li, Xiao-Yuan Jing, Fei Ma & Chen Zhu
School of Computer, Guangdong University of Petrochemical Technology, Maoming, 525000, China
Xiao-Yuan Jing
College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
Xiao-Yuan Jing

Corresponding author

Correspondence to Xiao-Yuan Jing.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, X., Li, S., Jing, XY. et al. Unsupervised domain adaption for image-to-video person re-identification. Multimed Tools Appl 79, 33793–33810 (2020). https://doi.org/10.1007/s11042-019-08550-9

Received: 22 March 2019
Revised: 16 September 2019
Accepted: 27 November 2019
Published: 16 January 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-019-08550-9
