Visualization of spatial matching features during deep person re-identification

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

Person re-identification (Re-ID) based on deep learning has made great progress and achieved state-of-the-art performance in recent years. However, because deep neural networks are trained end to end and map an input directly to an output, the inner working mechanism of a deep person Re-ID model and the reasons for its decisions lack transparency and explainability. This in turn impedes further improvements in pedestrian recognition performance. As feature visualization has proven effective for characterizing the intermediate layers of a neural network, we propose a novel gradient-based visualization method to interpret the internal features learned by a deep person Re-ID model. Following the idea of transfer learning, the model uses a ResNet-50 pretrained on the ImageNet dataset as the base network. First, the network is fine-tuned on a person Re-ID dataset to perform pedestrian classification; then, gradient-based visualization of the trained network highlights the regions that contribute most to image similarity. Experiments on the Market-1501 dataset show that our model not only lets the network identify key features of an individual across different images but also provides a visual interpretation of the pedestrian classification results, which improves the reliability of person Re-ID and fosters user trust in its decisions.
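The gradient-based step can be pictured with a minimal Grad-CAM-style sketch. The abstract does not state the similarity measure or how gradients are aggregated, so everything here is an assumption for illustration rather than the authors' implementation: the embedding is taken to be the global average pool of the last convolutional feature map, similarity is a plain dot product, and the function names and toy feature maps are hypothetical. Under those assumptions the gradient of the similarity with respect to the query feature map has a closed form, so no deep learning framework is needed.

```python
def global_average_pool(fmap):
    """Collapse a C x H x W feature map (nested lists) into a length-C embedding."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in fmap]


def similarity(query_fmap, gallery_fmap):
    """Assumed similarity: dot product of the two pooled embeddings."""
    f = global_average_pool(query_fmap)
    g = global_average_pool(gallery_fmap)
    return sum(a * b for a, b in zip(f, g))


def similarity_heatmap(query_fmap, gallery_fmap):
    """Grad-CAM-style map of query regions that support the similarity score."""
    channels = len(query_fmap)
    height, width = len(query_fmap[0]), len(query_fmap[0][0])
    g = global_average_pool(gallery_fmap)
    # For s = GAP(A) . g, the gradient ds/dA[k][i][j] equals g[k] / (H * W)
    # at every location, so the spatially averaged gradient is the same value.
    weights = [g_k / (height * width) for g_k in g]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    return [[max(0.0, sum(weights[k] * query_fmap[k][i][j] for k in range(channels)))
             for j in range(width)]
            for i in range(height)]


# Toy 2-channel, 2x2 feature maps (hypothetical values).
query = [[[4, 0], [0, 0]],    # channel 0 fires top-left
         [[0, 0], [0, 4]]]    # channel 1 fires bottom-right
gallery = [[[2, 2], [2, 2]],  # gallery image activates channel 0 only
           [[0, 0], [0, 0]]]

heatmap = similarity_heatmap(query, gallery)
score = similarity(query, gallery)
print(heatmap)  # [[2.0, 0.0], [0.0, 0.0]]
print(score)    # 2.0
```

Only the top-left cell lights up, because that is where the query activates the one channel the gallery image also activates. In this toy case the heatmap sums to the similarity score; that holds whenever the ReLU clips nothing. With a real fine-tuned ResNet-50 the feature maps would come from the network and the gradients from automatic differentiation.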

Acknowledgement

This work was supported by the National Science Foundation of China (No. 61572517) and the Foundation of Nature Science of Guangdong (2015A030310172).

Author information

Corresponding authors

Correspondence to Dongning Zhao, Li Li or Rongyu He.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, H., Zhao, D., Wu, C.H. et al. Visualization of spatial matching features during deep person re-identification. J Ambient Intell Human Comput (2020). https://doi.org/10.1007/s12652-020-01754-0

  • DOI: https://doi.org/10.1007/s12652-020-01754-0
