Multimedia Tools and Applications, Volume 78, Issue 6, pp 6607–6636

Deep learning with particle filter for person re-identification

  • Gwangmin Choe
  • Chunhwa Choe
  • Tianjiang Wang
  • Hyoson So
  • Cholman Nam
  • Caihong Yuan

Abstract

Person re-identification has attracted much attention in the multimedia community, but its accuracy and robustness remain challenging because the images used for verification vary in lighting, pose, noise, and ambiguity. Such practical challenges require relatively robust and accurate feature learning techniques. We introduce a novel deep neural network with PF-BP (Particle Filter-Back Propagation) to achieve relatively global and robust person re-identification performance. Despite several advanced approaches, local optima in deep networks remain the main difficulty in learning. We first propose a novel neural network learning method, PF-BP, to address the local optima problem in the non-convex objective function of deep networks. When the final deep network is trained with BP, the overall neural network combined with the particle filter behaves as the PF-BP neural network. We also propose a max-min value search based on two assumptions about the shape of the non-convex objective function being learned. Finally, we propose salience learning based on the deep neural network with PF-BP to improve person re-identification. We test our particle-filter-based neural network learning on the non-convex optimization problem, and then evaluate the proposed system in a person re-identification scenario. Experimental results demonstrate that the proposed deep network has promising discriminative capability in comparison with other methods.
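The general idea of pairing particle-style sampling with gradient updates to escape local optima can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the PF-BP algorithm of the paper: the one-dimensional objective, the function names (`loss`, `grad`, `gradient_descent`, `pf_gradient_search`), and every hyperparameter are hypothetical choices made only for illustration. Plain gradient descent stands in for back-propagation, and a simple weight-resample-jitter loop stands in for the particle filter.

```python
import math
import random


def loss(x):
    # Toy non-convex objective (an assumption for illustration):
    # local minimum near x = 1.13, global minimum near x = -1.30.
    return x ** 4 - 3 * x ** 2 + x


def grad(x):
    # Analytic derivative of the toy objective.
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=300):
    # Plain gradient descent, standing in for back-propagation.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def pf_gradient_search(n_particles=30, steps=200, lr=0.01, seed=0):
    # Particle-style search: every particle takes a gradient step, is
    # weighted by its loss, resampled, and then perturbed with decaying noise.
    rng = random.Random(seed)
    particles = [rng.uniform(-2.5, 2.5) for _ in range(n_particles)]
    sigma = 0.3  # initial jitter scale (hypothetical value)
    for _ in range(steps):
        # Local refinement: one gradient step per particle.
        particles = [x - lr * grad(x) for x in particles]
        # Weighting: lower loss gives higher weight (temperature of 1).
        losses = [loss(x) for x in particles]
        best = min(losses)
        weights = [math.exp(-(l - best)) for l in losses]
        # Resampling with replacement, proportional to weight.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Diffusion: Gaussian jitter that shrinks over time.
        particles = [x + rng.gauss(0.0, sigma) for x in particles]
        sigma *= 0.95
    return min(particles, key=loss)


x_gd = gradient_descent(2.0)
x_pf = pf_gradient_search()
print("gradient descent from x0 = 2.0:", round(x_gd, 3), "loss", round(loss(x_gd), 3))
print("particle-based search:", round(x_pf, 3), "loss", round(loss(x_pf), 3))
```

In this toy setting, gradient descent started at x = 2.0 settles in the shallower basin, while the particle population covers both basins and resampling concentrates it in the deeper one. Whether the same behaviour carries over to deep networks depends on details of the authors' method that the abstract does not give.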

Keywords

Deep learning · Non-convex objective function · Global optimum · Particle filter · Back-propagation · Person re-identification

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Visual Information Processing Laboratory, School of Computer Science and Technology, Kim Il Sung University, Pyongyang, Democratic People’s Republic of Korea
  2. Intelligent and Distributed Computing Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  3. Information and Communication Laboratory, School of Computer Science and Technology, Kim Il Sung University, Pyongyang, Democratic People’s Republic of Korea
